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ABSTRACT 


V  Time  series  models  with  autoregressive,  moving  average  and  mixed 
autoregressi ve-moving  average  correlation  structure  and  with  symmetric, 
heavy-tailed,  non-normal  marginal  distributions,  called  l -Laplace,  are 
considered . 

First,  a  flexible  mixed  model  NLARMA(p,q)  with  Laplace  (double 
exponential)  marginals  is  investigated.  The  correlation  structure  for 

_  y 

several  special  cases  is  derived.  The  innovation  sequence  for  the 
second-order  autoregressive  case,  NLAR(2),  is  derived.  Parameter 
estimation  in  the  NLAR( 1 )  models  is  discussed  in  terms  of  moments,  least 
squares  and  maximum  likelihood. 

>  Second,  a  family  of  continuous  random  coefficient  models  with 

;  / 

l-Laplace  distributions  are  examined.  The  Z-Laplace  distribution  is 
described  along  with  a  useful  transf ormation.  The  correlation  structure 
for  special  cases  is  derived.  For  a  special  case  when  l  is  one,  the 
BELAR(I)  model  with  Laplace  marginals,  the  maximum  likelihood  estimator 
of  serial  correlation  is  derived.  Least  squares  estimates  are  also 
derived  using  the  concept  of  a  linear  residual.  These  estimators  of 
correlation,  along  with  other  estimators  of  location  and  scale  are 
compared  in  a  small  simulation  study. 

)  Thirdly,  the  NLAR(1)  and  the  BELAR(I)  processes  are  compared  using 
higher  order  residual  analyses  based  on  the  uncorrelated,  but  dependent 

/<b  r.  ^ 

linear  residuals,  tR  f ., 

h. 

Finally,  open  problems,  as  well  as  possible  extensions  and 
applications  of  the  analyses  given  in  this  thesis  are  discussed. 
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I.  INTRODUCTION 


In  standard  time  series  analysis,  one  assumes  the  marginal 
distributions  of  { Xn }  are  Normal,  i.e.  Gaussian.  However,  a  Gaussian 
distribution  will  not  always  be  appropriate.  In  earlier  works  by  Gaver 
and  Lewis  [Ref.  1];  Jacobs  and  Lewis  [Refs.  2,33;  and  Lawrance  and  Lewis 
[Refs.  4,5,6],  stationary  non-Gaussian  time  series  models  were  developed 
for  variables  with  positive  and  highly  skewed  marginal  distributions. 

There  still  remain  other  situations  for  which  Gaussian  marginals  are 
inappropriate,  i.e.  the  marginal  time  series  variable  being  modelled, 
although  not  skewed  or  inherently  positive  valued,  has  a  large  kurtosis 
or  long-tailed  distribution.  The  position  errors  in  a  large  navigation 
system  have  such  a  distribution.  In  particular,  Hsu  [Ref.  7]  modelled 
pooled  position  errors  using  the  double  exponential  distribution  (also 
called  the  Laplace  distribution).  Also  McGill  [Ref.  8]  showed  that  the 
Laplace  distribution  provides  a  characterization  of  the  error  in  a 
timing  device  under  periodic  excitation.  Speech-waves  are  modelled 
using  Laplace  variables  (Davenport  [Ref.  93).  In  the  "speech-like" 
process  given  by  the  linear  AR(1)  model 


*„  ■  *  <1  -  ),/2E„ . 


(1.1) 


where  .8  S  c  S  .9,  the  innovation  sequence  {E^}  is  i.i.d.  Laplace  (Linde 
and  Gray  [Ref.  10]).  In  image  coding  systems  using  a  two-dimensional 


discrete  cosine,  DC,  transform,  Reininger  and  Gibson  [Ref.  11]  showed 
that  the  Laplace  distribution  gives  the  best  approximation  to  the 
distribution  of  the  non-DC  coefficients.  Recently  Sethia  and  Anderson 
[Ref.  12]  required  a  stationary  autoregressive  process  with  Laplace 
marginals  in  their  research  in  communications  technology. 

Even  before  Gaver  and  Lewis  [Ref.  1]  wrote  the  pioneering  paper  on 
the  subject  of  autoregressi ve  processes  with  a  specified  non-Normal 
marginal  distribution,  Gastwirth  and  Wolff  [Ref. 13]  had  derived  a 
solution  to  the  linear  additive  first-order  difference  equation 


X  -  pXn  +  E  .  (1.2) 

n  n-1  n 

for  which  { }  is  marginally  Laplace.  This  result  was  used  later  by 
Gastwirth  and  Rubin  [Ref.  14]  within  the  context  of  robust  estimation  on 
dependent  data.  This  solution  to  (1.2)  is  here  called  the  Laplace 
First-order  Autor egr essi ve  Process  ( LA R ( 1 ) ) .  The  early  solution  of 
(1.2)  is  mentioned  at  this  point,  merely  to  further  substantiate  the 
claim  that  non-Normal,  heavy-tailed  di stri butions  are  of  interest. 

In  this  thesis,  several  time  series  models  with  a  specified 
symmetric,  heavy-tailed  marginal  distribution  are  presented.  This 
distribution,  called  the  i-Laplace  distribution,  includes  the  Laplace 
distribution  as  a  special  case.  The  approach  in  Chapter  II  extends  the 
discrete  random  coefficient  model  of  Lawrance  and  Lewis  [Ref.  6],  New 
Exponential  Autoregressive  Moving  A ver age--NE ARMA ( p , q ) ,  to  the  case 
where  the  marginal  distribution  is  Laplace,  also  called  double 
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Exponential.  This  class  of  models  is  called  The  New  Laplace 
Autoregressive  Moving  Average  model,  NLARMA(p,q).  Several  special  cases 
of  NLARMA(p,q)  are  individually  researched.  The  second-order 
autoregressive  model,  NLAR(2),  is  established  by  showing  the  conditions 
for  existence  and  uniqueness  and  by  specifying  the  innovation  structure. 
The  correlation  structure  of  NLAR(2)  is  also  given  along  with  results 
concerning  directional  moments  and  partial  time  reversibility. 

For  the  case  when  p  =  1  and  q  =  0,  called  NLAR(1),  the  distribution 

of  the  difference  X  -X  ,  is  derived,  providing  some  insight  into  the 

n  n-1 

nature  of  the  differenced  NLAR(1)  model.  The  conditional  density  of  X^ 
given  X  .j  is  also  derived,  which  leads  to  a  brief  investigation  of  the 
likelihood  function.  Parameter  estimation  in  NLAR(1),  however,  is 
limited  to  comparisons  of  the  moment  estimators  and  the  least  squares 
estimators  for  the  independent  model  parameters  of  serial  correlation. 

The  correlation  structure  is  derived  for  other  models  in  the 
NLARMA(p.q)  family:  the  first-order  moving  average  called  NLMA(l);  the 
first-order  mixed  model  called  NLARMA(1,1);  and  the  special  cases  of 
pth-order  autoregressive  models  called  TLAR(p)  that  are  analogous  to  the 
TEAR(p)  model  of  Lawrance  and  Lewis  [Ref.  6].  These  models  demonstrate 
the  flexibility  of  the  NLARMA(p,q)  family. 

In  Chapter  III,  a  family  of  stationary  time  series  is  developed 
using  continuous  random  coefficients  in  the  additive  difference  equation 
model.  The  marginal  distribution  is  specified  to  be  a  member  of  the  so- 
called  H-Laplace  distributions,  the  properties  of  which  are  described  at 
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the  beginning  of  the  chapter.  The  "square-root  Beta-Laplace"  transform 


is  defined.  It  is  used  to  formulate  the  8,-Laplace  time  series  models. 

For  the  special  case  when  l  =  1 ,  the  marginal  distribution  is  again 

Laplace.  The  autoregressive  model  is  called  the  Beta-Laplace  First- 

Order  Autoregressive  model,  BELAR(I).  The  conditional  density  of 

given  X  .  is  derived.  This  leads  to  the  derivation  of  a  likelihood 
n- 1 

function  and  a  numerical  technique  to  evaluate  and  maximize  the 
likelihood  function  with  respect  to  the  model  parameter  for  serial 
correlation . 

Several  facets  of  the  parameter  estimation  problem  are  investigated 
for  BELAR(I).  The  behavior  of  different  estimators  of  scale  and 
location  are  compared  using  the  Simulation  Testbed  (SIMTBED)  of  Lewis, 
Orav  and  Uribe  [Ref.  15].  The  least  squares  estimation  theory  is 
derived  around  the  concept  of  a  linearized  residual.  Asymptotic 
properties  are  derived  using  results  from  Nicholls  and  Quinn  [Ref.  16]. 
Robust  estimators  are  defined  and  simulated  in  SIMTBED.  Finally,  a 
numerical  scheme  for  finding  the  maximum  likelihood  estimator  of  serial 
correlation  is  used  in  a  small  simulation  study  of  the  small  sample 
properties  of  the  maximum  likelihood  estimator. 

In  the  last  section  of  Chapter  III,  a  first-order  moving  average 
model  is  discussed.  A  qt!l-order  moving  average  model  in  2,-Laplace 
variables  is  also  derived. 

The  random  coefficient  approaches  are  not  the  only  ways  to  generate 
Laplace  or  other  variables  with  a  specified  correlation  structure.  The 


literature  contains  numerous  articles  on  generation  of  random  sequences. 


One  approach  put  forth  in  several  papers  (Gujar  and  Kavanagh  [Ref.  17]; 
Haddad  and  Valisalo  [Ref.  18];  Li  and  Hammond  [Ref.  19];  Liu  and  Munson 
[Ref.  20];  Sondhi  [Ref.  21])  involves  passing  white  Gaussian  noise 
through  a  linear  filter  followed  by  a  zero  memory  nonlinear  transform. 
This  is  a  general  procedure  that  produces  exactly  the  required  marginal 
distribution  and  a  good  approximation  to  the  autocorrelation  structure. 
However,  the  scheme  lacks  the  simplicity  of  either  of  the  methods  being 
proposed.  Moreover,  the  filtering  approach  produces,  for  example,  in 
the  first-order  autoregressive  case,  only  one  process. 

It  is  important  to  note  that  in  non-Normal  time  series,  there  are 
infinitely  many  processes  with  a  given  marginal  and  a>'tocorr elation 
structure.  This  is  the  case,  for  example,  in  the  two-paramet er  NL A R (  1  ) 
process.  The  differences  in  these  processes  must  be  explored  through 
higher  joint  moments.  In  Chapter  IV,  residual  analyses  using  fourth 
joint  moments  are  derived.  The  ideas  are  modifications  of  those  from 
Lawrance  and  Lewis  [Refs.  6,  22],  who  accomplished  an  analysis  using 
joint  third  moments  within  the  NEAR  framework.  The  residual  analysis  is 
applied  to  show  the  differences  in  the  various  NLAR( 1 )  processes  and  the 
BELAR(I)  process. 

In  Chapter  V,  open  problems  and  possible  extensions  of  the  analyses 
given  in  this  thesis  are  discussed.  Possible  applications  to  the 
analysis  of  wind  velocity  data  are  detailed. 
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II.  DISCRETE  RANDOM  COEFFICIENT  MODELS  WITH  LAPLACE  MARGINALS 


A.  INTRODUCTION 

Two  aspects  of  modelling  with  dependent  random  variables  are  widely 
studied — the  marginal  distribution  and  the  correlation  structure.  It  is 
widely  known  how  to  generate  sequences  with  either  a  specified  marginal 
distribution  or  a  particular  correlation  structure.  Transforming  the 
random  variables  may  have  an  undesirable  and  unknown  effect  on  the 
correlation  structure.  Likewise,  the  marginal  distribution  of  a 
filtered  process  may  be  unknown. 

It  is  the  generation  of  random  variables  with  both  a  specified 
marginal  and  a  specified  correlation  structure  that  is  discussed  in  this 
chapter.  Specifically,  we  want  sequences  with  a  Laplace  (double 
Exponential)  marginal  distribution  and  with  ARMA(p,q)  correlation 
structures  as  given  by  Box  and  Jenkins  [Ref.  23]  for  the  usual  linear 
ARMA(p ,q )  models . 

The  following  is  an  example  of  a  process  that  has  Laplace  marginals. 


Let.  {Xn}  be  a  binary  Markov  chain  with  transition  matrix  P,  so  that 


PCXn  =  0  !  Xn_  1  =0  ]  =  a  1  ,  P[Xn  =  1  |  Xn_  1  =0]  =  1-ar  P[Xn=1  |  Xn_  1  =  1  ]  =  a2>  and 

X 


P[X  =0  I  Xn_  1  =  1  ]  =  1  -(*2  •  bet  Ln  =  (-1)  En  ,  where  {  E^ }  is  an  i.i.d. 
Exponential  sequence.  If  a^=o.^  =  (x,  {L^}  has  a  Laplace  marginal 

distribution.  However,  the  correlation  structure  is  not  that  of  an 


A  R (  1 )  process. 


It  is,  in  fact,  easy  to  see  that  Corr(L  ,L  ,  ) 

n  n  —  u 


\  VV4* 


(1/2) (2o-1) ,  for  k=±1  , ±2 . 


which  is  not  a  pure  geometric  function 


of  k. 

Two  processes  which  produce  an  AR(1)  correlation  structure  and  a 
Laplace  marginal  distribution  are  the  Laplace  Discrete  AR(1),  LDAR(I), 
which  is  an  adaption  of  the  DAR(T)  process  of  Jacobs  and  Lewis  [Ref.  2], 
and  the  linear  process  of  Gastwirth  and  Wolff  [Ref.  13]*  called  the 
LAR(  1 )  process.  The  LDAR(1)  model  produces  an  {X^}  sequence  using  the 
first-order  autoregressive  equation  with  random  coefficients 


X  =  V  X  .  +  ( 1  -V  )  L  , 
n  n  n-1  n  n 


(II. A. 1) 


where  {V  }  is  an  i.i.d.  sequence  of  Bernoulli  random  variables  with 
n 

P(Vn=1)  =  1-P{Vn=0}  =  p;  {L^}  is  an  i.i.d.  sequence  of  Laplace  random 

variables.  The  coefficient  and  innovation  processes  from  time  n  are 

assumed  to  be  independent  of  xn_i »Xn_2 » •  •  ♦  •  This  sequence  produces  runs 

of  constant  value  when  successive  realizations  for  produces  the  value 

1.  When  V  is  zero,  a  new  value  is  selected.  Although  LDAR(1)  is  of 
n 

limited  value  in  general  application  because  of  this  runs  property,  it 
is  significant  in  that  it  is  one  of  the  first  in  a  series  of  more 
general  discrete  random  coefficient  equation  models  for  non-Normal  time 
series,  and  it  produces  a  first-order  autoregressive  Markovian  process 
for  any  specified  marginal  distribution. 

The  L A R (  1  )  model  turns  out  to  be  a  special  case  of  the  more  general 
process  called  the  New  Laplace  A ut or egr ess i ve  Moving  Average  model, 
NLA  RMA ( p  ,  q  )  .  Properties  of  the  LAR(1)  process  are  pointed  out  in  the 


next  section  of  this  chapter,  which  gives  a  characterization  of  the 


Laplace  distribution. 

The  NLARMA(p,q)  model  is  a  very  useful  family  of  time  series  models 
that  are  discrete  random  coefficient  linear  difference  equations.  The 
models  are  extensions  of  the  NEARMA(p,q)  structure  of  Lawrance  and  Lewis 
[Refs.  4,5,6]  to  those  cases  where  the  underlying  marginal  distribution 
is  Laplace  rather  than  Exponential.  The  family  provides  great 
flexibility  to  systems  modelling,  because  of  the  broad  range  of 
correlations  and  different  dependency  structures  which  are  obtainable. 

Section  C  is  an  examination  of  the  second-order  autoregressive  model 
of  the  family,  NLAR(2),  for  p  =  2  and  q  =  0  in  NLARMA(p,q).  Conditions 
for  the  existence  and  uniqueness  of  the  strictly  stationary  NLAR(2) 
model  are  derived  using  results  from  Nicholls  and  Quinn  [Ref.  16]  about 
Random  Coefficient  Autoregressive  models  of  order  k,  RCA(k).  In  a 
proof,  very  similar  to  that  given  by  Lawrance  and  Lewis  for  the  NE  AR(  2) 
model  [Ref.  6],  the  innovation  for  the  NLAR(2)  model  is  derived 
explicitly.  The  innovation  is  shown  to  be  a  convex  combination  of 
scaled  Laplace  variables.  The  correlation  structure  in  the  NLAR( 2) 
model  is  shown  to  satisfy  the  Yule-Walker  type  equations  just  as  do  the 
linear  AR(2)  models.  Aspects  of  directionality  and  time  reversibil ity 
are  also  addressed. 

In  Section  D,  the  first-order  autoregressive  model,  NLAR( 1 )  ,  is 

described.  It  is  a  two-parameter ,  first-order  Markov  process  which  is  a 

special  case  of  the  NLAR(2)  model.  The  distribution  of  differences  is 

derived.  The  conditional  density  of  X  given  X  ,  and  the  likelihood 

J  n  °  n- 1 


function  are  also  derived.  The  non-differentiability  of  the  likelihood 
function  for  all  values  of  the  two  parameters  has  prevented  the 
development  of  the  maximum  likelihood  estimators.  Parameter  estimation 
is  discussed  within  the  context  of  moment  estimators  and  least  squares, 
using  the  usual  linearized  residual. 

In  Section  E,  several  different  special  cases  of  NLARMA(p,q)  are 
formulated  and  briefly  discussed.  The  correlation  structure  for  a 
first-order  moving  average  model,  NLMA(  1 )  ,  and  a  mixed  autoregr  essi  ve 
moving  average  model,  NLARMA(1,1)  are  given.  Correlation  structure  is 
derived  and  parameter  estimation  is  discussed  for  the  general  pth-order 
autoregressive  models,  TLAR(p),  which  are  special  cases  of  the  NLAR(p). 

Each  of  these  models  in  Section  E  could  well  be  the  basis  for 
further  research.  The  intent  at  this  point  is  primarily  to  further 
substantiate  the  claim  of  wide  versatility  and  tractability  in  modelling 
non-Normal  time  serie3  within  the  context  of  the  NLARMA (p ,q )  family. 

For  example,  the  bivariate  AR(  1 )  process  with  Exponential  marginal 
distributions  of  Dewald  and  Lewis  [Ref.  24],  can  be  extended  to  the  case 
where  the  marginal  distribution  is  Laplace.  This,  however,  is  not 
discussed  further  in  this  thesis. 

B.  CHARACTERIZATION  OF  THE  LAPLACE  DISTRIBUTION 
1 .  Properties  of  the  Laplace  Distribution 

The  Laplace  distribution  is  also  known  as  the  double  Exponential 
distribution.  In  general,  the  density  of  a  Laplace  distributed 
variable,  L,  has  two  parameters — a  location  parameter  -®  <  y  <  «■«,  and  a 


scale  parameter  A  >  0. 


The  parameter  p  is  fixed  here  at  zero.  For 


-«D  <  x  <  ®  we  have 

f^(x ; A )  -  exp(-  |x(/X).  (II. B. 1.1) 

In  what  follows  we  will  define  {Ln}  as  a  sequence  of  i.i.d. 
random  variables  of  the  Laplace  distribution  with  A  »  1  (Standard 
Laplace).  The  characteristic  function  of  the  standard  Laplace  variable 
is 


<t>L(u>) 


1 

1  +  u>2  ’ 


-»  <  to  <  ®, 


(II. B. 1.2) 


and  we  have 


E(Ln) 


0  if  n  is  odd, 

n!  if  n  is  even, 


(II.  B.  1.3) 


so  that  E(L)  *  0,  Var(L)  =  2,  skewness  is  zero,  and  kurtosis  is  3.  The 
value  of  the  kurtosis  indicates  that  the  symmetric  Laplace  distribution 
has  heavier  tails  than  the  normal  distribution,  for  which  the  kurtosis 
is  0. 

The  sum  of  n  2  2  i.i.d.  standard  Laplace  variables  can  be 
written  os  the  difference  of  two  i.i.d.  random  variables  Y  ^  ,  Y^  with 


Gamma  distribution,  shape  parameter  k  =  n  and  scale  parameter  X  =  1. 


This  follows  immediately  from  the  characteristic  function.  Let 
n 

Y  =  £  L. ;  then 

i-1  1 


( to ) 


1  +  W 


1  +  ioi 


1  ~  iw 


<j>  (w)  ♦  (-w).  (II.  B.  1.4) 

1  2 


This  result  is  quickly  generalized.  Replacing  n  by  t  >  0,  we  see  that 
[^(u)]'"  is  the  characteristic  function  for  the  variable  X  *  Y1  -  1 
where  Y  ^  -  Gamma(t,1),  i  *  1,2  and  Y^  and  Y^  are  independent.  This 
demonstrates  that  the  Laplace  distribution  is  infinitely  divisible. 

Another  useful  result  is  obtained  from  (II. B.  1.4)  when  n  =  1. 
It  shows  that  a  Laplace  variable  is  the  difference  of  two  i.i.d. 
exponential  (A  =  1)  variables.  This  makes  it  quite  simple  to  generate 
Laplace  distributed  variates  in  computer  simulations. 

Random  variables  with  a  standard  Laplace  distribution  are  self- 
decomposable.  Let 


4>  (uj)  =  4>  (uj)/4>t  (poo),  0  £  p  <  1.  (II. B. 1.5) 

According  to  Feller  [Ref.  25:  p.  588],  if  <|>  (w)  is  the  transform  of  a 
random  variable  for  each  0  S  p  <  1,  then  L  is  said  to  be  self- 
decomposable.  But  for  -®  <  uj  <  ® 

<f>£(o))  =  {  1  +  (pw)2}(1  +  U)2)  1  , 

=  [p  ♦  (1  -  p)(1  -  iu>)~ 1  }  { p  +  (1  -  p)(1  +  iu))'1}  (II. B. 1.6) 
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p2  +  (1  -  P2)  (1  +  Ui2) 


(II. B. 1.7) 


We  recognize  (II.  8.  1.6)  as  the  product  of  the  characteristic  functions 
of  two  i.i.d.  innovation  variables,  e  and  a3  described  in  the 

EAR(1)  process  in  [Ref.  1],  Also  from  (II. B. 1.7) 


0  w.p.  p2, 

L  w.p.  1  -  p2. 


(II. B. 1.8) 


2.  The  Laplace  First-Order  Autoregressive  Process,  LAR(1) 

The  i.i.d.  sequence  with  distribution  given  in  (II.B.1.8) 
is  the  innovation  process  of  a  first-order  linear  autoregressive 
equation 


X  =  pX  .  +  e  i 
n  K  n-1  n 


(II. B. 2.1) 


where  {X^}  is  a  stationary  time  series  with  double  exponential  marginal 
distribution,  | p [ < 1 .  This  is  the  LAR(1)  model.  It  is  actually  a 
rediscovery  in  light  of  the  fact  that  Gastwirth  and  Wolff  [Ref.  13]  had 
derived  it  earlier;  also,  Gastwirth  and  Rubin  [Ref.  14]  discuss  it 
within  the  context  of  robust  estimation  techniques.  The  present  account 
of  LAR( 1 )  includes  new  results. 

The  LAR( 1 )  model  has  the  same  properties  as  the  EAR( 1 )  model  in 
[Ref.  1]  with  two  important  differences.  First,  if  -1  <  p  <  0,  negative 
serial  correlations  for  odd  lags  are  obtained.  Secondly,  it  is 


partially  time  reversible  in  the  sense  that  for  all  l  and  n,  both  of  the 
following  are  true: 


E(X?W  ■  E(VSU>  ■ 


P(X  2  X  ,  )  -  P(X  n  , )  -  1  12. 
n  n-1  n  n-1 


(II.B.2.2) 

(II.B.2.3) 


These  results  are  derived  in  Section  II. C  and  Section  II. D.  Note, 
however,  that  since  LAR(  1 )  is  a  linear  A R ( 1 )  model  with  non-Gaussian 
innovation  {e^},  it  is  not  fully  time  reversible  (Weiss  [Ref.  26]). 
Also,  note  that  this  LAR(1)  model  has  the  zero  defect  property;  when 
=  0,  then  X^/X^  i  =  P  and  P  can  be  determined  exactly  in  long  enough 
runs  of  the  series  ( X  } .  This  property  is  generally  undesirable,  but 
the  broader  NLAR( 2)  model  developed  in  the  next  section  is  free  of  this 
defect,  except  for  the  special  parameter  values  for  which  it  reduces  to 
the  LAR( 1 )  model . 

If  no  repeats  are  observed  in  a  realization  of  the  time  series, 

an  extremely  efficient  estimator  of  p  for  LAR(1)  is  the  median  of  the 

ratio  X./X.  The  simulation  results  given  in  Table  II. B. 2.1 

li-l 

substantiate  this  claim.  In  Section  II. D. 4  and  again  in  III.E.5,  using 
the  framework  of  the  Simulation  Testbed  (SIMTBED)  [Ref.  15],  we  will  see 

that  this  median  ratio  is  for  small  samples  very  biased,  and  is, 

apparently,  asymptotically  biased  in  all  of  the  random  coefficient  ARC  1 ) 
models  with  a  Laplace  marginal  distribution  that  we  examine. 
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TABLE  II. B. 2.1 


Simulation  Results  using  Median  {XVX.^}  to  Estimate 
p  in  the  LAR(1)  Process  for  Samples  of  Size  2000 


True  p 

p  «  med  {Xi/Xi_1 } 

Comments 

-.9 

-.9 

-.9  occurred  1586  times 
in  1999  ratios 

-.2 

-.2 

-.2  occurred  75  times 
in  1999  ratios 

-.1 

-.  08746 

-.1  occurred  11  times 
in  1999  ratios 

+  .01 

+  .01986 

+.01  never  occurred  in 
1999  ratios 

+  .5 

+  .5 

+.5  occurred  490  times 
in  1999  ratios 

+  .75 

+  .75 

+.75  occurred  1 1 49 

times  in  1999  ratios 

C.  A  SECOND  ORDER  AUTOREGRESSIVE  LAPLACE  TIME  SERIES  MODEL,  NLAR(  2) 

1 .  Introduction 

Using  the  terminology  from  [Ref.  6]  the  following  time  series 
model  called  NLAR(  2)  ,  New  Laplace  Second-order  Autoregressive  model  is 
proposed.  This  is  a  special  case  of  NLARMA(p,q)  model  with  p  =  2, 
q  =  0.  The  NLAR(2)  model  has  four  parameters,  double  exponential 
marginal  distribution  for  (X^},  second-order  autoregressive  Markov 
dependence,  and  autocorrelations  satisfying  Yule-Walker  type  equations. 


The  stationary  NLAR(2)  model  has  the  same  form  as  the  stationary 
NEAR(2)  model  in  [Ref.  6].  Writing  the  time  series  {X^}  in  the  form  of 
an  additive,  linear,  random  coefficient  autoregressive  difference 
equation,  we  have  for  all  n  that 

X  -  6-  K*  X  +  B  K"  X  +  e  ,  (II.C.1.1) 

n  1  n  n-i  2  n  n-2  n 

where  {K^,  K'^}  is  a  sequence  of  i.i.d.  discrete  bivariate  random 
variables  with  distribution 

w  .p  .  ci^ , 

w.p.  a-,  n  =  0,  ±1,  ±2,  ...  ; 

w.p.  1-  a. -Op*  (II. C. 1.2) 

{e^}  is  an  i.i.d.  innovation  sequence  whose  distribution  is  given  in 
(II.C.2.4);  and  { }  and  { ,  K^}  are  mutually  independent  and 

independent  of  xn_i*xn_2 .  The  parameter  space  is  defined  by 

0  S  | B. |  £  1  and  0  i  a.  i  1,  i  =  1,2;  a1  +  £  1.  Graphs  of  the 

admissible  regions  in  the  parameter  space  and  the  correlation  space  are 
presented  in  Section  II. C. 3- 

Equations  (II.C.1.1)  and  (II. C. 1.2)  have  a  direct  physical 
interpretation.  The  observed  process  at  time  n,  Xn,  is  only  one  of 
three  possibilities:  i)  X^  is  some  multiple  of  what  it  was  at  time  n-1, 
61Xn_1  ,  plus  some  random  noise  en;  ii)  X^  is  some  multiple  (possibly 
different  than  8^),  of  its  value  at  time  n-2,  B^X^  ^ ,  plus  some 


(1,0) 

{K^,  Kj^}  =  (0,1) 

(0,0) 
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independent  random  noise;  iii)  Xn  is  just  random  noise,  e  ,  independent 
of  everything  up  to  time  n. 

2.  Existence  and  Uniqueness 

The  work  of  Nicholls  and  Quinn  [Ref.  16]  on  random  coefficient 
autoregressive  models  is  relevant  to  the  NLAR(2)  process.  They  have 
given  the  necessary  and  sufficient  conditions  for  the  existence  of  the 
uni que  covariance  stationary  solution  to  the  following  class  of 
univariate  random  coefficient  autoregressive  models  of  order  p,  RCA(p): 


Z 

n 


l  Ilf, 


i-1 


Bn(i)}Zn  « 

n  n-i 


+  £ 


(II. C. 2. 1 ) 


n=»0,  ±1,  +  2,  ...,  where 

a.  the  Y.  's  a~e  real  constants; 

l 

b.  { }  is  a  p-vector ,  second-order  stationary,  independent  process 

with  E(B  )  =  0  and  constant  covariance  matrix; 

-n 

c.  (e^l  is  a  scalar,  second-order  stationary,  independent  process, 
independent  of  { } ,  with  E(e2)  =*  o2  for  all  n. 

They  also  have  shown  that  if  { }  and  {e^}  are  i.i.d.  processes, 

then  the  solution  {Z^}  is  strictly  stationary  and  ergodic. 

Let  Y.  =«  a. 8.  for  i  »  1,2  and  B  (1)  =  8,(K'  -  a,)  and  B  (2)  = 
iii  n  Ini  n 

8~(K"  -  a_) .  Then  (II. C. 1.1)  and  (II. C. 2.1)  have  the  same  form.  That 
2  n  2 

is  (II.C.1.1)  is  an  RCA(2)  model  if  the  innovation  of  NLAR(2)  satisfies 


condition  (iii)  above.  Thus  applying  the  results  in  [Ref.  16:  p . 31  and 
p . 37 ] ,  there  exists  a  unique  strictly  stationary  and  ergodic  solution  to 


roots  of  the  characteristic  equation 


(t2  -  a^t  -  a262)(t2  -  a^)  “  0, 


(II.C.2.2) 


are  within  the  unit  circle,  i.e.  iff  +  a2^2  <  1 •  This  is  satisfied 

for  the  conditions  on  the  parameters  defining  NLAR(2),  thus  establishing 

the  existence  of  the  model  (II. C.  1.1). 

No  marginal  distribution  is  ascribed  to  solutions  of  the  general 

RCA(p)  models  in  [Ref.  16].  It  is,  in  fact,  determined  by  the 

independent  choices  of  the  innovation  and  the  random  coefficients. 

However,  by  specifying  the  marginal  distribution  and  the  random 

coefficients,  in  NLAR(  2)  the  innovation  is  restricted  more  than  in  the 

RCA(p)  model.  If  the  X  in  (II. C.  1.1)  or  Z  in  (II. C. 2.1)  have  a 

n  n 

standard  Laplace  marginal  distribution,  then  all  their  moments  are  given 
by  (II. B.  1.3).  From  (II. C.  1.1)  or  (II. C. 2.1),  it  follows  that  for  all 

P  =  1,2,... 

E(efk)  =  {(2k)!}  [l  -  (a.8.2k  ♦  a,B.2k) 
n  11  2  2 


k-1 

I 

i  =  1 


{ ( a1  8 


2(k-i)  2(k~i).,  2( k-i ) 

1  a2e2  )EUn 


)/{(2i )!} \ ; 


( II. c 

>  o. 


2.2) 


end  for  this  to  be  true  it  is  necessary  that 


+ 


o,8 


2k 

O 


<  1. 


(II.C.2.3) 


Since  o  and  a 2  are  probabilities  it  is  necessary  that  |  g^  |  £  1  for 
i  -  1,2  for  (II.C.2.3)  to  hold.  If  not,  there  exists  for  every  >  0 
and  >  0  an  integer  m,  such  that  o^g!^  or  c^g^11  8reater  than  1. 

We  have  now  established  the  necessary  conditions  on  the 
innovation  {e^},  and  on  81  and  g^ — namely  that  IbJ  £  1,  i  -  1,2 — for 
the  existence  of  a  unique  strictly  stationary  solution  to  (II. C. 2.1) 
with  a  marginal  Laplace  distribution  and  with  the  random  coefficients 
given  by  (II. C. 1.2).  In  the  next  theorem,  we  show  that  | &  |  £  1  for 
i  =  1,2  is  also  a  sufficient  condition  and  that  such  an  innovation 
random  variable  exists.  We  also  give  its  explicit  form--a  convex 
combination  of  Laplace  random  variables.  For  simplicity,  the  parameter 
space  is  regarded  as  being  described  by  strict  inequalities  for 
THEOREM  1.  Let  ( }  be  a  stationary  process  with  standard  Laplace 
marginal  distribution.  For  all  n,  let  equations  (II. C. 1.1)  and 
(II. C. 1.2)  hold  with  0  <  |  g.  |  <  1,  0  <  a.  <  1  for  i  =  1,2  and 

<  1 •  Then 


e  =  K  L 
n  n  n 


L 

n 

w.p . 

i-p 

b2! 

lL 

1  n 

w.p . 

p2. 

b3i 

lL 

1  n 

w.p . 

P3’ 

(II.C.2.4) 


where  {L  }  are  i.i.d.  standard  Laplace  variates;  the  K  's  have  values  in 
n  n 

{1,  I  b  _  |  ,  I  b  I  }  and  are  independent  of  {L  )  and  { K  ’  ,  K"}  for  all  n. 
1  2 1  1  3  n  n  n 

They  are  also  independent  of  X  . ,X  _,....  Furthermore, 


P2  -  {(a^;  +  a2B2)b2  "  (a1  +  a2)ei  B2}/(b2  '  b3)(1  "  b2}  »  (H-C.2.5) 
P3  -  {(^  +  a2)  3f  q  ~  (c^B2  +  °282)b3}/(b|  "  b3)(1  "  b3}  *  (H.C.2.6) 

1  >  b2  =  ^{s  +  (s2  -  4r)1/2}  >  b2  =  i{3  -  (s2  -  4r)1/2}  >  0,  (II.C.2.7) 

3  =  0-  a^B2  +  (1  -  a2)g2,  and  (II. r. 2.8) 

r  =  (1  -  0l  -  ct2)B*B|.  ( II. C  .2.9) 

Proof : 

For  the  NLAR(2)  model  specified  by  (II.C.1.1),  (II.C.1.2)  and 
(II.C.2.4)  -  (II.C.2.9),  let  4>„  (  uj  )  and  4>  (oj)  be  the  characteristic 

A  £ 

functions  of  the  {X  }  and  {e  }  sequences.  If  {X  }  is  stationary,  then 

n  n  n 

<^(w)  =  4>  ( ui )  la^  <|>^  (  6  ^  oj)  +  a  2$^  (  8  2o) )  +  (1-a^-a2)}.  (II. C. 2. 10) 

Assuming  a  standard  Laplace  marginal  distribution  for  [X  },  the 
independent  distribution  of  {e^}  has  a  characteristic  function,  possibly 
improper,  given  by 


<t>  ( u> )  = 


(1+B2u>2)  (1+B|u2) 


( 1  +w2 )  [  ( 1  -a1  -a2)  B2  B2u)“  +  {(l-a^B2 


( 1 -a2) 8|}w2 


-  1]* 

( II.C.2. 1 1 ) 
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It  is  convenient  to  factor  the  quadratic  in  cu2  in  the 


denominator  of  (II. C. 2. 11).  The  roots  of  this  factor  are  both  real  and 
distinct.  To  see  this,  note  that 


{(1-^)8*  +  ( 1  -ct2)  2  -  4(  1-c^  ~a2)  62e| 

=  {(l-ct^B*  -  ( 1  -a^)  8  ^  J  2  +  a2B1262  >  °* 


The  roots  are  also  both  negative,  which  can  be  seen  by  noting  that  the 
product  r^2  =  1  /  ( 1  -c^  a  2 )  8  2  6  2  is  positive,  but  the  sum  r^  +  r 2 

=  -{(1-a^B2  +  (  1-a2)  8^ /( _a2')  8l  S2  iS  neSative* 

Letting  r  1  =  -1 /b2  and  r 2  =  -1  /  b 2 ,  we  can  rewrite  (II. C. 2. 11) 
using  partial  fraction  decomposition  to  obtain 


<t>£  ( U)  )  =  (1 


'P2‘P3,lWT 


'2V1  +b  2u>2  • 


3  1  +b2w 


(II. C. 2. 12) 


From  (II. C. 2. 11) 


b2  +  b2  =  (1-o^) 8*  +  (1-a2)B|  =  s 


and 


(II. C. 2. 13) 


b2b2  =  ( 1-a1 -a2) S28|  =  r. 


( II. C .2. 14) 


Comparing  (II. C. 2. 12)  and  (II. C. 2. 11)  term  for  term  also  yields 


p„(  1-b2)  +  p_,(  1-b2) 


a,  8 2 , 


+  a„B 


2 


( II.C .2. 1 5) 


Expressions  for  b*,  b*,  Pg  and  are  obtained  in  terms  of  , 
a2>  61  and  82»  by  solving  (II. C. 2. 13)  -  (II. C. 2. 16).  From  solving 
(II. C. 2. 15)  and  (II. C. 2.16)  simultaneously,  we  obtain  (II.C.2.5)  and 
(II.C.2.6).  Equations  for  b*  and  b*  given  in  (II.C.2.7)  are  obtained 
from  solving  (II. C. 2. 13)  and  (II. C. 2. 14)  simultaneously.  Arbitrarily, 
let  b|  be  the  larger  value. 

It  remains  now  to  show  that  the  inversion  of  (II. C. 2. 12)  will, 
in  fact,  yield  a  function  that  is  a  probability  density  and  is  the 
mixture  of  densities  for  scaled  Laplace  variables.  To  do  this,  we  show 
that  p2  and  p^  can  be  interpreted  as  probabilities  and  that  p2  +  p^  <  1. 

To  establish  that  p2  +  p^  <  1,  we  have,  after  adding  (II.C.2.5) 
and  (II.C.2.6) 


(c^  B*+a282)  -  (c^ +a2)  B*  Bp 
P2  +  P3  =  (1-b|)  ( 1-b|) 


(II. C. 2.  17) 


Multiplying  out  (1-b|)(1-b*)  and  using  (II. C. 2. 13)  and  (II. C. 2. 14),  we 
have,  after  some  r earrangement 


(1-B*) (1-B|) 

(1-B*)(1-B|)  +  c^B'd-Bp  ♦  o2B*(1-B*)  * 


( II. C. 2. 18) 


This  expression  is  clearly  positive  and  less  than  one,  from 


which  it  follows  that  p  +p_<1. 


To  show  that  p2  and  are  probabilities,  it  remains  to  show 
that  they  are  both  positive.  To  do  this,  it  is  shown  that  the 
numerators  and  the  denominators  of  (II.C.2.5)  and  (II.C.2.6)  are 
positive.  For  the  denominators ,  this  requires  that  0<b|  ,  b*  <1,  which 
is  shown  by  noting  0<(1-b|)  (1-b*)<1 .  From  (II. C. 2. 17)  and  (II.  C.  2. 18). 
it  follows  that 


(i-b|) (i-b|) 


(1-Bf)(1-B|)  +  a^Jd-Bf)  +  a26^(1-8p  >  0. 


Also , 


(1-b|)(1-b|) 


(b  1  +  b«)  -  b|b* 
(1-^)8*  +  (1-a2)B| 


( 1-o1 -a2) 8* B| 


=  (l-o^B^d-Bp  +  (1-a2)B|(1-B|)  +  8*6*  >  0. 


Therefore,  b*  and  b*  are  less  than  one,  so  p2  and  p^  have 


positive  denominators. 


To  see  that  p2  and  p^  have  positive  numerators,  note  that  it 


must  be  true  that 


(a  +a  ) 8*  8* 

b3  <  b  =  (^  B"*+a2B*)  <  b2  * 


(II. C. 2. 19) 


U3ing  (II.C.2.8)  and  (II.C.2.9),  (II. C. 2. 19)  is  equivalent  to 


or 


or 


-(s2-4r)1'/^  <  2b  -  s  <  (s2-4r)1/^, 
s2  -  4r  >  (s-2b)2, 

sb  -  b2  -  r  >  0. 


But  the  lefthand  side  of  (II. C. 2. 20)  is 

ot1ct2S1  6|(  6i  ”sp  2 
( a1 B2+a2B2 ) 2 


(II. C. 2. 20) 


which  is  strictly  positive. 

Therefore,  p2  and  p^  are  both  positive  and  p2+p^<1.  Therefore, 
P2,  p^  and  1-p2“p^  can  be  regarded  as  probabilities.  Therefore  en  has  a 
proper  density  which  can  be  generated  as  the  mixture  of  three  Laplaces 
with  scale  parameters  1,  | b 2 1  and  |b^|,  respectively.  Q.E.D. 

The  general  NLAR(2)  model  uses  the  four  parameters  to  achieve  a 
wide  range  of  sample  path  behavior.  Figure  II. C. 2.1  illustrates  four 
different  realizations  of  the  NLAR(2)  process.  In  each  case,  the 
theoretical  autocorrelations  are  identical  with  p(1)  =  0.64  and 
p ( 2)  =  0.5.  Also,  note  that  each  sample  path  was  generated  from  the 
same  i.i.d.  standard  Laplace  sequence  {L^},  such  that  (X^,X2)  =  (L^,L  ). 
Since  this  is  not  the  steady  state  bivariate  distribution  of  (X  ,X  ,|  )  , 
the  sample  paths  illustrated  in  Figure  II. C. 2.1  are  displayed  beginning 
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with  Xc-,  to  avoid  the  initial  transient  behavior  of  the  process.  The 

true  value  of  each  parameter  is  displayed  above  the  corresponding  sample 

path.  Figure  II.C.2.2  contains  the  scatter  plots  for  each  sample  in 

Figure  II.C.2. 1.  The  sample  size  in  each  plot  is  2000. 

Many  special  cases  of  the  NLAR(2)  model  could  be  mentioned.  The 

following  have  one  or  more  of  the  parameters  at  their  boundary  value  and 

have  valid  but  less  complicated  results  for  the  distribution  of  {e^}  in 

(II.C.2. 4).  If  a1  -  aig  =  0,  then  is  the  i.i.d.  sequence  {Ln}  and 

X  »  e  .  If  a  *  1  then  {e  }  is  the  innovation  of  the  LAR(l)  model 
n  n  1  n 

derived  from  (II. B. 1.7)  and  (II. B.  1.8).  If  { B 1  j  =  |  B  J  =  1  and 
a1  +  a2  <  ^  then  each  is  distributed  as  a  scaled  Laplace  random 

variable,  /  l-o^-c^  L^.  These  models  are  called  the  TLAR(2)  models, 

which  are  easily  extendable  to  higher-order  autoregressions,  as 

discussed  in  Section  II. E.  If  a,  <1  and  =  0  or  80  =  0,  then  {e  }  is 

12  2  n 

the  innovation  of  the  new  first-order  autoregressive  model  NLAR(1). 
This  model  is  the  subject  of  Section  II. D. 

3.  Autocorrelation  Structure 

In  this  section,  it  is  shown  that  the  aut  ocor  r  el  at  i  ons 

p(H)  =  Corr(Xn,  ,  l  =  0,±1,±2,...  of  the  NLAR(2)  model  satisfy  the 

Yule-Walker  type  difference  equations;  thus  the  second  moment  dependency 

aspects  are  indistinguishable  in  form  from  those  for  the  AR(2)  process. 

We  also  compare  the  admissible  regions  of  an  AR(2)  with  (i)  an  NLAR(2) 

with  4  parameters  and  (ii)  an  NLAR(2)  with  only  two  parameters. 

From  the  independence  of  {K  }  and  {K',  K"},  and  (II. C. 1.1), 

n  n  n 

(II. C. 1.2)  and  (II.C.2. 4),  we  see  that  E(K’)  =  a,,  E ( K " )  =  a 0  and 

n  1  n  2 
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SCATTER  PLOTS;  p(  1)  =  .64  AND  p( 2) 


Figure  II.C.2.2.  NLAR(2) :  Scatter  Plots;  p(l)=.64  and  p(2)= 


E(en)  -  E( Kn ) E( )  -  0.  Multiplying  (II. C. 1.1)  on  both  sides  by  Xn_^  we 
have  for  t  J  1.  ECX^.,)  -  «,  6,  EU^X^)  *  °2S2E(X„-2Xn-  t ) . 
Dividing  by  Var(Xn)  we  have  p(-l)  -  a^pd  -  1)  +  a2&2p(!  -  2),  since 
p(-Jl)  -  pU) .  Substituting  a.  0.  =  a  for  i  -  1,2  and  p(0)  -  1  ,  we  have 

p( 1 )  =  at  +  a2p( 1 ) 

p( 2)  -  a  p(1)  +  a2>  (II. C. 3-D 

which  are  the  same  equations  as  those  which  occur  for  the  AR(2)  process. 

Since  |  B^  |  SI  for  i  *  1,2  and  +  <*2  S  1  in  NLAR(2),  the  usual 
triangular  admissible  region  for  AR(2)  given  in  Box  and  Jenkins 
[Ref.  23:  p.  61  ]  shrinks  to  the  interior  of  a  diamond-shaped  area  in 
(a1  =  a2  =  a282)  coordinates:  |  a  ^  |  +  |  a  2 1  <  1.  (See  Figures 

II. C. 3* la  and  1b).  In  (p(1),  p(2))  coordinates  the  equation 
p(1)2  =  (1  +  p(2))/2  defining  allowable  combinations  of  p(1)  and  p(2)  in 
A R(  2)  also  changes.  For  NLAR(2),  the  space  in  (p(l),  p(2))  coordinates 
becomes  a  triangular  region  bounded  below  by  (  p ( 1 )  [  =  ^-{  1  +  p (  2) } .  (See 
Figures  II. C. 3- 2a  and  2b). 

The  reduction  in  allowable  parameter  or  correlation  combinations 
for  NL A R ( 2)  over  the  AR(2)  model  is  not  large.  This  encouraged  us  to 
consider  a  2-parameter  NLAR(2)  model  by  specifying  ci^  =  0? ,  for  i  =  1,2, 
so  that  a^  =  3?.  The  parameter  space  in  (a  ,a2)  coordinates  becomes  the 

o  /2 

symmetric  region  bounded  by  the  curves  B|  =  ±  (  1  -  Bp  (see  Figure 
II.  C.  3-  1c).  In  (8j,  B2)  coordinates  the  admissible  region  of  the  two 
parameter  model  is  bounded  by  the  unit  circle  B*  +  B|  =  1  •  Using  only 
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Boundary  of 
for  Linear  . 


ADMISSIBLE  REGION  FOR  p(1)  AND  p( 2) 

b:  NLAR(2)  MODEL  WITH  4  PARAMETERS 


Figure  II.C.3.2.  Point  plots  of  Admissible  Region  for  p(l)  and  p(2)  for 
Linear  AR(2)  and  NLAR(2)  Processes 


two  parameters  leads  to  the  admissible  region  in  Figure  II. C. 3. 2c  for 
(p(1),  p ( 2) )  space.  The  (p(1),  p(2))  space  was  obtained  by  transforming 


the  lines  ^  5  a2  *  0|  ^  c  S  1,  in  Figure  II. C. 3. 1c  to  p(2)  = 

( 1  -a2 )  p  ( 1 )  2+a2,  where  |  p  ( 1 )  |  £  apO-ap  =  BpO-Bp  and  &2  =  (1-Bp372 
if  ap  0  and  B|  =  -(1-Bp372  if  a2  <  0. 

All  the  plots  in  Fig.  II. C. 3*1  were  generated  from  a  grid  of 
equally  spaced  values  of  a^  and  a2*  In  Fig.  II. C. 3. la  the  points 
satisfy  the  Yule-Walker  equations  (5.1).  In  Figs.  II. C. 3- 1b  and  1c,  the 
points  also  satisfy  the  conditions  of  Theorem  1.  In  Fig.  II. C. 3*2  the 
feasible  combinations  of  p(1)  and  p(2)  are  plotted  for  those  values  of 
a1  and  a2  from  Fig.  II. C. 3.1  using  the  Yule-Walker  equations  (5.1). 

4.  Directional  Moments  and  Partial  Time  Reversibility 

In  the  last  section,  we  demonstrated  that  the  second  moment 

dependency  aspects  of  the  NLAR(2)  model  were  indistinguishable  in  form 

from  those  of  the  ordinary  AR(2)  model.  Also,  it  is  well  known  that  if 

the  linear  autoregressive  model  is  not  Gaussian,  then  the  process  is  not 

completely  determined  by  the  first  and  second  moments.  Thus  in  model 

identification  it  becomes  necessary  to  examine  third  order  moments  to 

further  identify  the  process.  Special  third  order  moments  E(X2  X  „), 

n  n+2, 

for  all  l,  are  known  as  directional  moments.  If  the  directional  moments 
for  all  5,  are  equal,  which  is  necessary  for  a  process  to  be  fully  time 
reversible,  we  say  the  process  is  partially  time  reversible  in  the  sense 
of  directional  moments. 

A  process  is  fully  time  reversible  (Lawrance  [Ref.  27])  if  the 

joint  distribution  of  X  ,  X  ,,...,  X  ,  is  the  same  as  that  for  X  , 

n  n+1  n+r  n+r 

....X^  for  all  r  and  for  all  n.  Since  LAR(1),  a  special  case  of 
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X 


0-1  » 


NLAR(  2)  ,  Is  not  fully  time  reversible,  NLAR(2)  is  in  general  not  time 


reversible . 

In  this  section  we  show  by  induction  arguments  that  all  the 
third  order  moments  of  NLAR(2)  are  the  same  as  those  for  Gaussian  AR(2) 
model;  i.e.  E(X.X  X  )  =  0  for  i,  j,  k.  This  implies  particularly  that 

1  J  K 

the  directional  moments  of  NLAR(2)  are  equal  and  therefore  that  NLAR(2) 
is  always  partially  time  reversible. 

In  Section  II. B,  we  found  that  E(X2)  =  0  for  all  i  since  X ^  is 
marginally  Standard  Laplace.  It  is  easy  to  establish  the  following  two 
equations: 

E(X  X2  .)  =  0oa_E(X2X  .);  (II. C. 4.1) 

n  n-1  22  n  n-1 

E(X2X  )  =  {(82a0)/(1  -26  B  ao,))  E(X  X2  ).  (II.C.4.2) 

n  n-1  22  1212  n  n-1 

Solving  (II. C. 4.1)  and  (II.C.4.2),  simultaneously  yields  E(X^X2_1)  = 

E(X2X  )  -  0. 
n  n-1 

Now,  usim1,  separate  induction  arguments  and  the  stationarity 

assumption,  we  establish  that  E(X  X2  „)  =  0  for  all  2.  £  1  ,  and 

n  n- i 

E(X2X  ,  )  =  0  for  all  k  >  1 . 
n  n-k 

The  proof  of  E(X  X2  .)  =  0  is  straightforward, 
n  n- £ 

To  prove  E(X2X  .  )  =  0,  we  first  show  that  the  expectation  of 
n  n-k  K 

special  third  order  moments  of  the  form  X  X  ,X  .  for  k  5  2  is  zero. 

n  n-1  n-k 

Define  p,  =  E(X  X  ,X  ,  )  and  assume  E(X2X  .  )  =  0,  j  S  k  -  1.  From 
k  n  n-1  n-k  n  n-j  J 

( II. C. 1.1)  , 


w 


a  -  E(X_X_  ,X_  ,.)  -  a,  8,  E(X*X_  ,,, _ ,  ^  )  +  a0B0E(X„X„_1X^_  ,  J 


n  n-1  n-k  “l P1 n  n- (k-1 ) '  “2P2^  n  n-1  n-(k-1 ) 


a262yk-1 


(a2S2^k  1p1 


(II.C.4.3) 


Now  from  (II.C.-4.1)  and  (II.C.4.2),  we  have 


U,  =  E(X  X2  , )  -  a_S~E(X2X  ,)  =  0.  Therefore  =  0. 
1  n  n-1  22  n  n-1  k 


We  now  proceed  to  show  that  E(X.X.X  )  =  0  for  all  i,  j,  k. 

1  J  K 

Without  loss  of  generality  let  i  <  j  <  k  so  that  k=i+n,j=i+m 

and  n  >  m.  Therefore  by  stationarity  E(X.X.X. )  =  E(X.X.  X.  )  = 

ljk  ii+mi+n 

E (X . X .  /  ,X.  ).  Fixing  m  so  that  0  <  m  <  n  we  use  induction  on  n. 

l  i-(n-m)  x-n 

Let  n  =  2,  implying  m  =  1.  The  first  step  in  the  induction  follows  from 

E  ( X i X . _ 1 X ^ _ 2 )  =  u2  =  0.  Next  assume  that  for  m  <  n  £  K, 

E  ( X  X .  ,  .X.  )  =  0.  Now  we  show  that  E(X.X.  .  ,X.  )  =  0. 

i  l-(n-m)  l-n  l  i-(K+1-m)  i-(K+1) 

Using  (II. C. 1.1),  we  write 


E(XiXi-(K+1-m)Xi-(K+1 ) }  “  a1 S1 E(Xi-1Xi-(K+1  -m)Xi-(K+1 ) } 

+  a2e2E(Xi-2Xi-(K+1-m)Xi-(K+1 ) 
+  E(eiXi-(K+1-m)Xi-(K+1))* 


) 


Now  E(  eixi- (k  +  1  -m)Xi-(K  +  1  )  }  =  E  (  e  i }  E  (  X  i  -  ( K  + 1  -m  )X  i  -  ( K  + 1  )  }  0  and 
E(Xi.1X._(K+1_[n)Xi_(K+1))  -  E(XiXi_(K.in)X1.K)  =  0  by  stationarity  and 

the  assumption.  Likewise  E(Xi_2Xi_^[<,  +  1_m^X._^K  +  1^)  = 

E(X .  X .  r/I,  ,,  ,  X .  =  0.  This  completes  the  induction, 

l  i-{ ( K— 1 ) — m }  l — (K— 1 ) 

An  immediate  result  from  the  argument  about  third  moments  is 


that  Z 


X 


X  for  (X  }  of  the  NLAR(2)  is  not  skewed. 


The  residual  analysis  in  [Ref.  6]  and  [Ref.  22]  using  cross 


correlations  between  linear  autoregressive  residuals  R  =  X  a,X  ,  - 

n  n  1  n- 1 

a„X  and  their  squares  R2,  does  not  shed  any  new  light  on  the 

2  n  -  2  n 

directionality/reversibility  in  the  NLAR(2)  model  or  help  in  identifying 

the  appropriateness  of  the  Laplacian  model.  This  is  because  all  third 

moments  have  zero  expectation.  Thus,  we  see  that  E ( R 2  R  „)  = 

n  n  +  a 

E(R  R2  „)  =  0  for  all  l. 
n  n+i, 

Note  that  the  basis  for  the  residual  analysis  using  the  { } 

process  is  that  this  process  is  uncorrelated  but  not  necessarily 

independent.  The  moment  results  show  that  the  Rn's  have  zero  skewness. 

In  fact,  it  is  easy  to  show  that  the  distribution  of  R  is  the  same  as 

the  distribution  of  -R  .  Thus  the  R  '  s  are  symmetric  although  they 

n  n  J 

will,  of  course,  not  have  Laplacian  distributions. 

In  Chapter  IV  of  this  thesis,  a  residual  analysis  based  on 
certain  fourth-order  moments  is  presented. 

D.  THE  NEW  LAPLACE  FIRST-ORDER  AUTOREGRESSIVE  MODEL,  NLAR( 1 ) 

1 .  Introduction 

The  new  Laplace  first-order  autoregressive  model  is  another 
special  case  of  the  NLARMA(p,q)  model  when  q=0  and  p=1  .  This  is,  of 
course,  a  special  case  of  the  NLAR(2)  model,  where  either  a2  and/or  82 
are  zero  in  (II. C. 1.1).  Examples  of  the  different  sample  path  behavior 
obtainable  from  the  NLAR(1)  Process  are  given  in  Figure  II. D. 1.1.  Note 
that  each  sample  has  the  same  value  of  lag-1  serial  correlation,  i.e. 
p(1)  =  Corr (X  ,Xn_ 1 ) .  In  Figure  II. D. 1.2  are  the  corresponding  scatter 
plots  for  the  samples  in  Figure  II. D.  1.1.  In  the  scatter  plot  labeled, 


a  =1",  the  distinctive  regression  line,  x 


px  is  clearly  visible 


NLAR(  1 ):  SAMPLE  PATHS;  p(l) 


Figure  II. D. 1.1.  NLAR(l) :  Sample  Paths;  p(l)=.64 


for  the  LAR( 1 )  process.  This  is  produced  as  explained  in  Section  II. B, 


because  the  innovation,  e  ,  can  be  zero  with  non-zero  probability. 

The  t wo- par amet er  autoregressive  model  generates  an  {X^} 
sequence  which  satisfies 


K' 8,X  ,  +  e  , 

n  1  n-1  n' 


(II. D. 1.1) 


where 


K' 

n 


and 


1  w.p.  o, 

0  w.p.  1-a. 


K'B.X  +  K  L 
n  1  n-1  n  n 


w.p.  i-p. 


/  1 "ai I ei I Ln  w-p*  P2 


(II. D. 1 .2) 


P2  =  a1 8^/(1  *  ( 1-a1 ) B* } 


(II. D. 1.3) 


Also,  {K^},  t  } ,  { L^}  are  i.i.d.  sequences  independent  of  each  other 

and  independent  of  X  ,,  X  . . 

n-1  n-2 

From  (II. D. 1.2)  and  (II.D.1.3)»  we  see  that  the  inversion  of  the 

_ i /2  —  1 

characteristic  function  for  e  ,  letting  X  =  (1-a^)  (IbJ)  ,  gives 
for  0<a1<1 


f  (x) 
e 

n 


( 1“P2) 
2 


(II.  D.  1 .10 


which  is  a  convex  mixture  of  Laplace  densities.  This  result  also 
follows  directly  from  Section  II. C. 3.  since  the  NLAR(1)  model  is  an 


NLAR(2)  model.  Likewise,  the  correlation  structure  and  partial  time 
reversibility  in  the  sense  of  directional  moments  are  the  corresponding 
results  for  the  NLAR(2)  model  with  a2=0  or  62»0.  That  is 

Corr(Xn,Xn_k)  =  (ag)lkl  for  all  k  -  0,  ±1,  ±2,  ...  (II. D. 1.5) 

and 

E(X2X  ,  )  =  E(X  X2,  )  =  0  for  all  k  =  0,  ±1,  ±2.  (II. D. 1.6) 
n  n+k  n  n+k 

We  can  rewrite  (II. D. 1.1)  as 


.  J-1 


i  =  e  +  l  8^(  ir  K'  ,)e  . 

n  n  jfl  1  i=0 


(II. D. 1.7) 


From  (II.D.1.1),  it  is  clear  that  X  depends  only  on  X  ,  and  e  .  From 

n  y  n- 1  n 

(II.D.1.7),  we  see  that  X  ,  is  independent  of  e  ,  for  all  k£0.  Hence 

n-1  n+k 

{ Xn >  is  a  first-order  Markov  process  and  starting  XQ  with  a  standard 

Laplace  distribution  makes  {X^}  stationary. 

The  remainder  of  this  section  is  devoted  to  specific  results  for 

the  NLAR(1)  process  which  have  not  been  shown  in  the  more  general 

NLAR( 2)  model.  The  extension  of  these  results  to  the  NLAR(2)  process 

would  require  the  joint  distribution  of  { Xn  ,X  .Xn_2 } ,  which  has  not 

been  derived.  The  conditional  density  of  X  given  X  ,  is  derived,  as 

n  n-1 

well  as  an  expression  for  the  joint  distribution  of  the  X  .  The 

n 

distribution  for  the  differences  Z  =X  -X  ,  is  also  derived.  Parameter 

n  n  n-1 

estimation  is  discussed  in  the  context  of  moment  estimators  and  least 
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squares  using  the  linearized  residual.  The  problems  with  finding  the 

maximum  likelihood  estimators  of  and  8  are  also  addressed. 

2.  Conditional  Density  and  the  Joint  Density  of  (X  , ...,X  ) 

To  find  the  conditional  density  of  X  ,  given  X  ..we  use 

1  n  n-  1 

(II. D. 1.1)  -  (II. D.  1.4)  to  evaluate  P(X  <x  |X  .).  We  have  for  a,<1, 

n  n 1  n- 1  1 

which  eliminates  the  LAR(1)  process, 


P(X  <x  X  .  )  =  P(K* 8  X  +  e  <x  X  . ) 
n  n  n-1  n  1  n-1  n  n  n-1 


a  P(e  <x  -8.x  .)  +  (l-a.)P(e  <x  ) 

1  n  n  1  n- 1  1  n  n 


x  -8, x 
n  1  n-1 


°1  f 


n 

f  (x)dx  +  ( 1 -a  )  (  f  (x)dx. 
e  lie 

n  1  n 

-fi-00 


(II. D. 2.1 ) 


Differentiating  (II. D.  2.1)  with  respect  to  x^  yields  the  following 
expression  for  o^O, 

fX  |X  ,<XJX„-,>  •  “lfE  (V6tVl>  *  (,'“l)rE  (XnK  UI.D.2.2) 

n 1  n- 1  n  n 

Examples  of  (II.D.2.2)  for  a  fixed  x  .  and  fixed  Y  =  a. 8,  =  .64  are 

n- 1  11 

given  in  Figure  II. D. 2.1. 

Now  we  can  write  the  joint  density  fY  Y  (x  ,  x  .)  as  the 

a  \  .  n  n  i 

n  n- 1 

product  fv  iv  (x  lx  , )fv  (x  ,).  In  fact,  the  n-di mens i onal 
X  X  ,  n 1  n- 1  X  ,  n- 1 
n 1  n- 1  n- 1 

distribution  of  X  ,...,X  is  obtained  using  this  product  recursively  to 
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CONDITIONAL  DENSITIES  IN  THE  NLAR(1)  PROCESSES 


Figure  II.  D. 2.1.  Examples  of  Conditional  Density  of  X  |  X  -.in  the  NLAR(l) 


obtain  the  density 


"x  ...X.(V‘*,,X1)  “  fX  jX  Cxn l Xn-1  }  fX  |X  (xn-1  lXn-2' 
n  1  n 1  n-M  n^1  1  n-2 


(II.D.2.4) 


3.  Distribution  of  Differences  and  P(X  ,>X  ) 
_ n^i  n 

We  now  consider  the  distribution  of  the  difference  Z  =  X  -X  , . 

n  n  n-1 

Using  (II. D. 1.1)  *  (II. D. 1.4)  and  the  fact  that  is  a  convex  mixture 

of  Laplacian  random  variables,  we  used  partial  fraction  decomposition  to 

invert  the  characteristic  function  of  Z  to  obtain  the  following 

n 

expression  for  the  density: 


fz  (y)  =  exp{-|y|/(1-B1 )} 


a1  (1-6, ) 

2 

■ 

P2  0‘P2) 

{ ( 1-B1 )  2-“a2 }  -  B1  ('2-:B1 ) 


exp(- |y | /a) ( op?/2) 


a. 


( 1-a 


a2  "  ( 1*B1 )  2  l^o2 


+  -exp(-|y | ) 


(1-‘a1)p2  (1“a1  )  (1‘iP2)  a1(1^p2) 


1-o" 


B1(2-B1 ) 


+  (1-p  )(1*a  ) |y  |exp(-|y  |  )/4, 


(II.  D.  3.1) 


where  o2  =  ( 1~a. ) B? . 


One  immediate  result  is  that  f^  (y)  is  symmetric  about  zero  and 

n 

therefore,  P(Z  <0)  =  P(Z  >0)  -  1/2.  This  demonstrates  one  additional 
n  n 

feature  of  the  partial  time  reversibility  of  the  NLAR(1)  models;  i.e., 

probabilities  of  a  run  down  (X  >X  ,  )  and  a  run  up  (X  <X  )  are  the 
r  n  n“  1  n  n~  1 

same.  To  evaluate  probabilities  of  higher  order  runs  would  require  the 
joint  distribution  of  the  sequence  {Z^}.  This  result  has  not  been 
obtained  for  the  NLAR(2)  model. 

4.  Estimation  of  Serial  Correlation 
a.  Introduction 

The  purpose  of  this  section  is  to  present  estimators  of  the 

two  parameters  and  f}^  whose  product  is  the  correlation  coefficient  in 

the  NLAR( 1 )  models.  We  assume  throughout  this  section,  unless  otherwise 

stated,  that  (X^l  has  a  standard  Laplace  (u=0,  A  =  1)  marginal 

distribution.  Estimation  of  \i  and  A  for  models  that  have  marginal 

Laplace  distributions  are  discussed  in  Chapter  III.  We  also  only 

consider  the  random  coefficient  models  of  the  NLAR(1)  process,  i.e.  ct<1  , 

thus  eliminating  the  LAR( 1 )  model.  As  was  shown  in  the  introduction  to 

this  chapter,  for  0^=1,  81  can  be  estimated  very  efficiently,  thus 

eliminating  the  need  for  further  discussion. 

The  method  of  moments  is  used  first  to  find  an  estimator  of 

y  =  The  joint  moment  estimators  of  and  81  are  calculated  from 

fourth^order  moments.  These  estimators  are  used  later  in  an  iterative 

procedure  to  obtain  the  joint  least  squares  estimators  of  and  8^ . 

A  least  squares  estimation  procedure  is  defined  for  the 

NLA  R  (  1  )  models  using  the  usual  linear  residual  R  =  X  “"a.B.X  . 

n  n  1  1  n- 1 


a 


The  expectation  of  fourth^order  moments  can  be  calculated  using 


(II. D. 1.1)  and  the  fact  that  {X  }  is  a  stationary  process.  For  example 


(II.D.4.3) 


E(XiXi-1}  “  12aiSi{1  +  (2-^)0*}  , 

E(xixi-i}  “  4(i+5a1ep  , 

EU.X^)  =  24a131  , 

ECXjX^X^)  -  4o101t1+2a1012+3a1(2-o1)0j}. 


(II.D.4.4) 

(II.D.4.5) 

(II.D.4.6) 


Solving  for  and  B^  in  different  pairs  of  these 
equations  gives  the  estimators  based  on  f ourth^order  moments.  It  is  to 
our  advantage  to  use  the  expressions  with  the  lower  order  moments  where 
possible.  Therefore,  using  ECX^X?^)  and  E(X^Xj^)  instead  of 
E(XiXi-i )  •  we  solve  for  the  following  expressions  for  the  joint  moment 
estimators  of  and  B 


Ja  X‘X“’ 


( n— 1 ) 


I  x?x*  .  -  4(n-1 ) 
i=2 


(II. D. 4.7) 


B 


1 


n 

l  (x?x?  )  -  4 ( n- 1  ) 

„  1 


10 


n 

l 

i  =  2 


x ,  x . 
i  i- 


(II.D.4.8) 


From  the  scatter  plot  analyses  in  Figures  II. D. 4.1  and 
II.D.4.2,  we  see  an  example  of  the  behavior  of  this  pair  of  estimators 
when  a1  =  B1  =  .8  in  the  NLAR(1)  model.  Both  scatter  plots  contain  500 


pairs  (a  ,6  )  derived  first  from  samples  of  size  250  and  then  from 


SCATTER  PLOT  TABLE 


Scatter  Plot  Analysis  of  Joint  Moment  Estimators  of  (a^,^)  in  the  NLAR(l) 
Process  for  500  Samples  of  Size  250  with  a, =g  =.8 


samples  of  size  2500.  It  is  clear  from  the  equations  (II. D. 4.7)  and 

A  A  A 

(II.D.4.8)  that  Y  -  a1®1-  The  hyperbola  can  he  seen  in  both  scatter 
plots.  Both  parts  are  visible  for  sample  size  of  250.  However,  for 
pairs  derived  from  samples  of  size  2500,  only  the  part  in  the  first 
quadrant  is  visible. 

From  the  Normal  probability  plots  in  Figure  II.D.4.3, 

A  A  A 

there  is  little  evidence  of  non-Normality  for  Y  =  a  g  for  N  =  250,  and 
less  for  the  estimator  derived  from  samples  of  size  2500.  However, 

A  A 

individual  estimators  and  B1  look  far  less  Normal  for  both  sets  of 
sample  sizes. 


c.  Least  Squares  Estimation  in  the  NLAR(1)  Process 

(1)  The  Linear  Residual.  The  properties  of  the  linear 
residual  are  developed  for  use  in  deriving  the  least  squares  estimators 


of  Y  =  a1B1  and  for  and  B]  jointly.  We  begin  by  rewriting  (II. D. 1.1) 
in  the  RCA(1)  form  as  given  in  (II. C. 2.1).  We  have 


X  -  a.B.X  .  +  B.CK'-aJX  .  ♦  e  . 
n  1  1  n^ 1  1  n  1  n* 1  n 


(II.D.4.9) 


From  this  expression,  there  are  clearly  two  ways  to  write  down  the 
linear  residual,  R^.  The  usual  one  from  linear  theory  is,  of  course 


R  =  X  -a.B.X  , 
n  n  1  1  n~l 


(II. D. 4. 10) 


NLAR(l) 


However,  a  particularly  useful  way  of  looking  at  it  is  from 


Rn  -  ,]WV  (II. D. 4. 11) 

It  is  from  (II. D. 4. 11)  that  we  see  explicitly  how  the  i.i.d.  innovation, 

{e  },  and  the  coefficient  {K'^a,}  processes  impact  on  the  linear 
n  n  i 

residual . 

Let  Tp'  be  the  a^algebra  generated  by  [  {  (  K  ’ -a  .  )  ,  e.  }; 
n-|  k  l  K 

k-1 , .  .  .  ,n-1  ]  .  Intuitively,  ,  represents  all  the  information 

about  the  process  up  to  time  1.  Conditioning  on  >  we  have  the 

following  two  useful  properties  of  as  noted  by  Nicholls  and  Quinn 
[Ref.  16:  p.  42] . 

E(B„i 

E<^l 

These  results  follow  because  X  ,  is  a  function  only  of 

n-i 

the  process  through  (n-1)  and  (K^*a^ )  and  en  are  both  independent  of  it. 

(2)  The  Least  Squares  Estimator  of  T  =  a^.  Using  Rn  from 

(II.  D.  4. 10)  and  a  given  sample  from  { X  }  ,  we  obtain  the  least  squares 

n 

estimator  by  minimizing  the  sum  \  R?  with  respect  to  the  product  a.B. 

i  =  2  1 

which  is  now  called  Y .  We  have 
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Y 


(II.D.4.15) 


which,  in  fact,  is  the  usual  expression  for  the  estimation  of  serial 
correlation  in  linear  AR(1)  models  as  given,  for  example,  in  Chatfield 
[Ref.  31  :  P.  66]. 

Since  the  NLAR(1)  process  is  an  RCA(1)  process  of 
Nicholls  and  Quinn,  it  follows  from  their  theorem  [Ref.  16:  p.  44]  that 
Y  is  strongly  consistent,  asymptotically  unbiased  and  — \  --( y*Y)  has  an 

/IT 

asymptotic  Normal  distribution.  The  asymptotic  variance,  from  the  same 
results  of  Nicholls  and  Quinn,  is 


ai  =  1  +  50^8*  6(01^  B1  ) 2.  (II. D. 4. 16) 

Figures  11.0.4.4^11.0.4.7  contain  the  boxplot  analysis 
of  SIMTBED  [Ref.  15]  output  for  selected  choices  of  and  81  in  the 
simulation  of  the  least  squares  estimator  of  the  product  in  the 

NLAR( 1 )  processes.  Note  that  although  the  estimated  asymptotic  mean  is 
the  true  value,  Y  -  =  .64,  for  each  of  the  four  sets  of  the 

parameters,  the  estimated  asymptotic  variance  of  the  estimator  of 
a 1 6 1  -  Y  is  different  for  each  of  the  four  different  sets  of 

parameters.  The  simulation  results  reflect  the  asymptotic  theoretical 
results  for  the  NLAR(1)  processes  as  given  above. 
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An  analogous  result  is  given  in  Section  III.E.4,  where 

the  theory  of  least  squares  is  derived  for  the  Beta^Laplace  AR(1)  model. 

(3)  The  Joint  Least  Squares  Estimation  for  and  61  .  It 

n 

is  not  possible  to  minimize  £  R2  with  respect  to  a.  and  g  individ^ 

i-2  1  1 

ually.  However,  a  technique  from  Nicholls  and  Quinn  [Ref.  16:  p.  43], 
which  uses  the  result  in  (II. D. 4. 13)  is  applicable.  As  was  pointed  out 
earlier  in  Section  II. C. 2,  by  assuming  nothing  about  the  particular 
marginal  distribution,  Nicholls  and  Quinn  were  free  to  treat  the 
variances,  o2  and  ,  as  completely  independent  parameters  subject  only 

£  I\ 

to  the  constraint  that  the  marginal  distribution  of  {X^},  whatever  it 

is,  has  a  positive  variance.  Then,  given  (II. D. 4. 13),  it  was  possible 

n 

to  estimate  a2  and  a2,  by  minimizing  the  sum  of  squares  £  S2  where 


S  - 
n 


o2  X2 
X'  n- 


(II.D.4.17) 


and  R2  =■  (X  ■‘TX  .)*  and  Y  is  from  (II.  D.  4.  15).  They  derive  the 
n  n  n- 1 

properties  of  the  trivariate  distribution  of  the  estimator  of 
(T,  o2,  a2,). 

Since  a2  and  o2,  are  related  parametri cally  in  and 
8 1  ,  the  results  in  [Ref.  16]  concerning  the  variances  do  not  apply  in 
the  NLAR(1)  process.  However,  we  can  form  from  (II. D. 4. 13)  and 
(II.D.4.10)  an  analogous  expression  for 


73 


2(1-a1812), 


(II.D.4.18) 


where  the  product  ot ^  6 ^  in  R^  is  not  replaced  by  Y  from  (II. D.  4. 15). 

In  terms  of  a  sample  from  { Xp } ,  we  define  the  joint 

A 

least  squares  estimators  of  and  0  to  be  those  values  and  0^  that 
minimize 


n 

l  {(Xj-a^x  )2  -  a1(1-a1)02x2-1  -  ) }  2  ,  (II. D. 4. 19) 
i-2 

where  (II.D.4.19)  is  the  sum  of  the  squares  of  given  in  (II.D.4.18). 
Now  it  is  clear  that  (II.D.4.19)  is  a  highly  nonlinear  expression  in  two 
unknowns,  a  and  g^ .  A  given  numerical  technique  could  converge  to  a 
local  extremum,  a  saddle  point,  or  diverge  depending  on,  among  other 
things,  the  starting  values  for  estimating  and  0^. 

Constraining  the  nonlinear  optimization  problem  given 
by  (II.D.4.19)  to  the  rectangle  within  which  the  NLA R ( 1 )  process  is 
defined**0  £  £  1  and  *1  £  g^  £  1 -““eliminates  the  divergence  problem, 
but  clouds  the  estimation  issue  regarding  the  boundary  models  LAR(1)  and 
TLAR(1).  We  try  an  unconstrained  approach  described  below. 

(4)  An  Unconstrained  Nonlinear  Optimization  of  (II.D.4.19). 
It  is  easy,  but  tedious,  to  write  the  normal  equations  from  (II.D.4.19). 
One  critical  point  is  at  =  g^  =  0.  After  factoring  a ^  from  the  one 
equation  and  g1  from  the  second,  several  iterations  of  the  Newton- 
Raphson  method  (see,  for  example,  Gerald  [Ref.  28:  pp.  122*128])  can  be 
performed  to  find  other  critical  points.  The  Newton-Raphson  method  uses 
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a  second-order  Taylor  series  approximation  to  solve  the  non-linear 
system  by  a  set  of  linear  Jacobian  equations.  However,  one  needs  to 
calculate  the  four  second  partial  derivatives  from  (II. D. 4. 19)  and  to 
have  a  good  starting  point  on  the  surface. 

The  IMSL  routine  ZSPOW  solves  systems  of  non-linear 
equations  for  one  root  using  modified  Newton  methods.  This  routine  was 

A  A 

used  to  solve  the  unconstrained  problem  of  finding  ot  and  B  from  sets 

of  data  from  simulated  NLAR(1)  processes.  The  routine  was  very  senstive 

to  starting  values  and  did  not  always  converge  even  when  the  sample  size 

was  as  large  as  2500.  It  also  did  not  perform  well  when  the  true 

correlation  coefficient,  Y  -  was  small  for  any  of  the  simulated 

I  k  I 

NLAR(  1  )  processes  with  the  same  autocorrelation  function,  Y1  1 .  This 
problem  is  highlighted  by  the  fact  that  (II.D.4. 19)  is  constant  along 
the  line  -  0  and  the  line  B1  =  0. 

As  an  illustration  of  the  performance  of  the  routine, 
500  sets  of  sample  sizes  250  and  2500,  respectively,  were  generated  from 
the  NLAR(1)  process  with  =■  8^  -  .8.  The  scatter  plot  analyses  in 

A  A 

Figures  II.D.4.8  and  II.D.4.9  show  how  the  estimators  a1  and  61 
determined  by  ZSPOW  are  related.  Especially  for  the  samples  of  size 
250,  there  is  the  same  pattern  of  the  hyperbola  as  seen  in  the  moment 
estimators  of  and  B1  given  in  Section  1 1 .  D  .  4  .  b . ( 2  )  .  From  the 
accompanying  tables,  it  is  clear  that  the  variance  of  the  marginal 
distributions  for  each  estimator  a  and  B1  is  decreasing  with  increased 
sample  size.  The  Normal  plots  of  the  empirical  marginal  cumulative 
distribution  functions  for  and  for  B1  appear  very  non-Normal  even 


500  Samples  of  Size  250  with 


Scatter  Plot  Analysis  of  Joint  Least  Squares  Estimators  of  (01^,8^)  in  the  NLAR(l) 
Process  for  500  Samples  of  Size  2500  with  a  =B_ =.8 


from  estimators  derived  from  samples  of  2500.  On  the  other  hand,  the 

A  A  A 

Normal  plots  of  Y  =  indicate  that  the  distribution  is  converging  to 

a  Normal  distribution  as  required  by  theoretical  results  of  the 
previous  subsection.  (See  Figure  II. D. 4. 10). 

It  is  convenient,  at  this  point,  to  summarize  the 
results  on  the  moment  and  least  squares  estimation  of  Y  =  and 

(a^B^)  in  the  NLAR(1)  processes. 

In  the  estimation  of  Y,  only  second^order  product 
moments  are  required  for  both  methods.  From  the  Normal  probability 
plots  in  Figures  II.D.M.3  and  II. D. if.  10,  it  appears  that  both  estimators 
of  Y  are  converging  to  Normal  distributions.  Although  the  moment 
estimator  of  Y  is  unbiased  (the  least  squares  estimator  is 
asymptotically  unbiased),  the  variance  of  the  moment  estimator  of  Y  is 
considerably  larger  than  that  of  the  least  squares  estimator  of  Y. 

The  estimation  of  a1  and  B1  requires  fourth-border 
product  moments  for  both  methods.  The  variance  of  the  moment  estimators 
of  a1  and  81  are  too  large,  even  for  samples  of  size  2500  to  be  useful 
in  distinguishing  between  NLAR(1)  processes.  The  least  squares 
estimators  of  and  B1  have  smaller  variances  than  the  corresponding 
moment  estimators  and  could  be  useful  in  distinguishing  between  NLAR(l) 
processes.  However,  as  pointed  out  above,  the  numerical  routine  to  find 
the  critical  points  does  not  always  converge  for  a  given  starting  value 
of  a1  and  8 ^  .  The  conclusion  is  that  neither  method  of  estimating  a 
and  6  is  very  satisfactory. 
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(5)  The  Median  (X^/X^)  Estimator  of  Y  *  a^.  The  median 
of  (X./X^)  was  seen  to  be  extremely  efficient  in  the  LAR(1)  process. 
It  also  makes  sense  in  the  context  of  maximum  likelihood  estimation  in 
LAR( 1 ) .  This  is  discussed  in  the  next  section. 

Simulation  results  confirm  the  conjecture  that  the 
median  (X^/X^  )  is  not  a  robust  estimator  of  Y  for  departures  from  the 
LAR(l)  process.  In  fact,  from  the  boxplots  in  Figures  II. D. 4. 11  - 
II. D. 4. 14  of  SIMTBED  output  for  four  NLAR(1)  processes,  the  estimators 
seem  to  become  more  biased  as  8  approaches  one — corresponding  to  the 
other  boundary  process,  TLAR( 1 ) .  Even  for  the  small  size  of  the 
simulations,  the  standard  deviation  of  the  mean  is  small.  For  the  three 
non-LAR(l)  models,  the  asymptotic  estimates  of  the  mean  of  Y  given  in 
the  data  are  each  significantly  different  from  the  theoretical  value  of 
Y  =  .64. 

d.  Method  of  Maximum  Likelihood 

(1)  Introduct  i  on .  The  logarithm  of  the  likelihood 
function,  L(a^,6^),  is  obtained  by  taking  the  natural  logarithm  of  the 
n-dimensional  joint  density  given  in  (II.D.2.4)  and  treating  it  as  a 
function  of  a1  and  81  for  a  given  realization  of  length  n  from  {X^}.  We 


80 


i 

i 


i 


kV 


0OM  CO 

o*-o>j  © 

rOCW  ON 
0*0  •  • 
■<JO 


O  <VCT4fN 

i  jnr*o 

UU MO  OCC 


PW  — 

o  u... 

I  Urf/i  ..  «/5 

WCQ-X  UJU»(/N  h- 

ococ^iTNeo  oar-  z 

(nOKMONBO  UWZ  UJ 

o*m»/>©©  — — 

N0OKO  •  .  IO-  O 

•  •  oo  — o  — 

OOO  t  </>u»—  u. 

ULkU.  U. 

Dww  uj 
<OU!  O 
KUO  u 
CNJ  UJ  o 

o  >  • 


UJp-O, 

®**0»fNO 

OffONO 

s0©JT\0*“ 


-J  ZtflV) 

ft.  <tA— 

X  UlUJhO 

<  xzo 

trtUJ  X  JH 
flDN i  <OOws 
3—  uj*-^-sc© 
</)(/)  X (/></)</> X 


I 

X  UJ 

zoz  o 

0—0  X 

i/>—  < 

X (/></)  — 

Ourt/")  cc 

— xuj  < 
COOfiC  > 
fcOUJO 
UJOCUJ  x 

X  X  o 

OU. 

UJOU.  x 

X  o  o 

UJ  — 
U-0>  (/> 

OXU  1S> 
<0  UJ 
X—  X 
<XO  O 

UJ<*-  UJ 

X></>  « 


<u 

0> 

•H 

Cn 


II. D. 4. 14.  SIMTBED  Boxplot  Analysis  of  Median  Estimator  of  ^=01.^  with 

y=.64  in  the  TLAR(l)  Process 


have 


n 

L(a1 ,  B1 )  =  -n(i.n2)  -  |x1  |  +  l  JlnCc^  ( 1-p2)exp{-|x. -B^.^  | } 


+  (1-a  ) (1-p2)exp{-|xi | }  +  a1P2Aexp{-A|xi-61xi_1 | } 


+  (1-a1 )p2Aexp{-A|xi | }],  (II. D. 4.20) 

1 

where  p2  was  given  in  (II.D.13)  and  A  =  /(l-ct^B*  . 

Maximizing  (II.D.4.20)  in  the  general  NLAR(I)  model  is 
not  accomplished  here  for  two  reasons.  First,  L(<x1,f}^)  is  not 
differentiable  with  respect  to  B1  at  any  of  the  n  values  B1  =  x./x.^ 
for  i  =  1,...,n,  because  of  the  terms  |x.-B1xi_1|.  A  bivariate  search 
routine  that  does  not  use  derivatives  is  needed. 

Second,  L(a^  ,  6^  )  is  not  defined  along  the  line  ot1  =  1 
at  any  of  0  £  k  £  n  values  of  B1  such  that  -1  <  B1  =  xVx.^  <  1  •  To 
see  this,  examine  the  third  term  of  the  natural  logarithm  in 
(II.D.4.20).  We  have  replacing  A  for  all  i  =  2,...,n 


'(1-^)0* 


/( 1  -a1 ) 6* 


(II. D. 4.21 ) 


Because  of  the  presence  of  the  exponential  term  in  (II.D.4.21),  the 
limit  as  a  approaches  one  is  zero,  so  long  as  81  *  x^/x^_1 .  The  limit 
does  not  exist  on  the  set  B  =  {8^|B1  =  xj/xi_i>  1  =  2,...,n}. 
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1 ,  corresponding  to 


It  is  worth  noting  that  for  a1  * 
the  LAR(1)  model,  and  except  on  the  set  B,  (II. D. 4. 20),  can  be  written 
as 


L(  1 ,  B1 )  =  -nUn2)  -  |x1  |  +  (n-l)ln(l-Bj) 


-  I 


n 

i=2 


X.-B..X.  , 

l  1  1-1 


B,  t  B. 


(II. D. 4.22) 


Now  fcn(1-B*)  is  maximized  at  B]  =  0  and  the  optimal  value  for 
n 

\  lx.-B.x.  |  is  the  least  absolute  deviation  (LAD)  estimator  of  B. 
i-2  111-1 

which  is  the  weighted  median  of  )  where  the  weights  are  x^_ ^  for 

i  =  2,...,n.  Thus,  if  after  a  large  number  of  observations  from  {X^}  no 

repeats  of  x./x._1  are  observed,  then  there  will  be  little  difference 

between  a  particular  LAR(  1  )  model  and  the  completely  random  model  of 

i.i.d.  Laplace  variables.  In  this  case,  for  any  B1  in  a  small  deleted 

neighborhood  around  81  =  med(x  /x^  ) ,  (II.D.M.22)  will  be  large  because 

n 

both  in(1-B*)  and  \  |x  will  be  optimized. 

i  =  1 

(2)  The  Maximum  Likelihood  Estimator  of  a  in  the  TLAR(1) 
Processes .  In  this  section,  the  likelihood  function  for  the  TLAR(1) 
process  is  described.  The  maximum  likelihood  estimator  is  found  using  a 
numerical  iteration  scheme.  The  properties  of  the  estimator  are 
investigated  and  compared  to  the  least  squares  estimator  using 


simulation . 


For  the  TLAR(1)  models  (f^  *  1  or  ^  ■  -1),  (II.D.M.20) 
can  be  written  as  a  one-dimensional  function  of  the  a  variable  a.  We 


n  a 

L(a)  =  -n(in2)  -  |x^  |  +  £  in  — — 


1=2  /I -a. 


+  /l-a1  exp 


(II. D. 4. 23) 


where 


x.  -  x.  .  a  £  0, 
li-l 


x.  +  x.  .  a  <  0, 
li-l  ’ 


(II. D. 4. 24) 


-1  <  a  <  1  and  *  |  a|  . 


Now  L(a)  is  continuous  everywhere  in  the  open  interval 

(-1f+1)  and  differentiable  everywhere  except  at  a  =  0.  The  expressions 
dL(a)  d2L(a)  , 

for  — _ -  anci  — _ — _ —  are  lengthy  and  cumbersome  to  use;  hence  are  not 

da  da 


given  here. 


Examples  of  the  likelihood  curve  are  given  in  Figures 


II.D.lJ.15  -  II. D.  4. 18.  Each  curve  was  generated  from  a  sample  of  100 
from  a  simulated  TLAR(  1 )  r""ocess  with  the  stated  a^  and  61 .  It  is  easy 
to  see  the  non-differentiable  point  at  zero  and  how  flat  the  curve  is. 
To  see  that  there  is  a  maximum  (annotated  with  x  on  the  figure)  in  these 
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Figure  II. D. 4. 15.  TLAR(l) :  Log-Likelihood  Function;  a  =.5,  8  =-l  and  SSN=100 


Figure  II. D. 4. 16.  TLAR(l)  :  Log-Likelihood  Function;  a^.l,  and  SSN  100 


Figure  II. D. 4. 17.  TLAR(l) :  Log-Likelihood  Function;  a  =.64,  g  =1  and  SSN=100 


curves,  the  second  part  of  each  figure  focuses  on  the  function  near  the 
true  value  of 

The  IMSL  routine,  ZXLSF,  a  one-dimensional  search 
routine  was  used  to  find  the  value  of  a  that  maximized  (II.D.4.23).  The 
starting  value  a  was  the  least  squares  estimator  of  serial  correlation 
given  by  (II.D. 4. 1 5) . 

Using  500  samples  of  sizes  50  and  500,  respectively, 
from  simulated  TLAR(1)  processes  with  Y  =  .64,  the  scatter  plot  analyses 
in  Figures  II.D. 4. 19  and  II.D. 4. 20  were  completed.  The  least  squares 
estimator  and  maximum  likelihood  estimator  appear  to  be  correlated. 
From  the  accompanying  tables,  the  maximum  likelihood  estimator  appears 
to  have  a  smaller  variance  and  bias  than  the  least  squares  estimator. 
Analysis  of  the  boxplots  from  a  SIMTBED  comparison  of  the  least  squares 
estimator  and  the  maximum  likelihoood  estimator  reflect  the  same  results 
(see  Figures  (II.D. 4.21  -  II.D. 4. 22). 

From  the  Normal  plots  given  in  Figure  II.D.4.23,  both 
the  least  squares  and  the  maximum  likelihood  estimator  appear  to  be 
coverging  to  a  Normal  distribution.  There  are  three  or  four  outliers  in 
the  tail  out  of  500  points. 

E.  OTHER  CASES  OF  THE  NLARMA(p.q)  MODEL 
1 .  Introduction 

A  primary  advantage  of  the  NLARMA(p,q)  model  is  the  ease  with 
which  the  basic  framework  can  be  altered  to  cover  a  variety  of  different 
dependency  structures.  The  NLAR(2)  and  NLAR(1)  processes  have  been 
examined  closely  in  the  previous  sections  of  this  chapter.  At  this 


.19.  Scatter  Plot  Analysis  of  the  Maximum  Likelihood  and  the  Least  Squares 
Estimators  of  y  in  the  TLAR(l)  Process  for  500  Samples  of  Size  50  with 
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Figure  II. D. 4. 20.  Scatter  Plot  Analysis  of  the  Maximum  Likelihood  and  the  Least  Squares 

Estimators  of  y  in  the  TLAR(l)  Process  for  500  Samples  of  Size  500  with 
a,  =  .64  and  3-,  =+l 
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time,  the  moving  average  first-order  model,  NLMA(1),  and  the  mixed 
model,  NLARMA(1,1),  are  briefly  considered.  The  correlation  structure 
and  parameter  space  are  discussed  for  each  model. 

The  TLAR(1)  model  for  which  the  maximum  likelihood  estimation 
was  completed,  can  be  easily  extended.  As  the  final  part  of  this 
section,  we  present  the  p^-order  autoregressive  processes  for  arbitrary 
p  £  2.  The  conditions  for  existence  and  uniqueness,  the  correlation 
structure  and  likelihood  function  are  given.  The  maximum  likelihood 
estimation  scheme  for  the  p  parameters  is  also  discussed. 

2.  A  Backwards  MA(1)  Model,  NLMA( 1 ) 

a.  Correlation  Structure  of  the  NLMA(1)  Process 

From  (II. D.  1.1),  we  see  that  Xn  is  the  random  coefficient 
sum  of  independent  variables  each  of  which  have  a  marginal  Laplace 
distribution.  Therefore,  we  can  replace  X n_ 1  by  another  Laplace 
variable.  If  it  is  independent  of  Ln  and  has  a  standard  Laplace 
marginal  distribution,  then  by  the  construction,  X^  will  still  have  a 
standard  Laplace  marginal  distribution. 

If  we  replace  X^_^  ,  in  fact,  by  L-n_1  in  (II. D. 1.1),  we 
obtain  the  following  expression  for  Xn 

X  =  K'B.L  +  K  L  ,  (II. E. 2.1) 

n  n  1  n-1  n  n 

where  {KM  and  {L  }  are  as  given  in  (II. D. 1.2)  and  {K  }  is  the 
n  n  n 

corresponding  two-valued  discrete  variable  as  given  in  (II.C.2.4)  for 
the  NLAR( 2)  model. 
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Since  Xn_k  is  by  construction  in  (II. E.  2.1)  independent  of 
Xn  for  )k|  2  2,  we  see  that  the  model  has  the  cut  off  property  of  a 
linear  MA(1)  model.  The  maximum  range  of  correlations  in  any  MA(1)  is 
less  than  or  equal  to  1 1 /2 j  ,  (Fuller  [Ref.  29:  p.  62]).  This  range  is 
achieved  by  the  linear  MA(1)  models.  Some  of  the  random  coefficient 
M A ( 1 )  models  have  been  shown  to  have  a  maximum  range  for  the 
Corr (X^ ,Xn-1 )  to  be  strictly  less  than  one-half  (see  Hugus  [Ref.  30]). 

Using  (II. E. 2.1)  recursively,  we  derive  the  serial 
correlation  in  NLMA(1)  as 


corr(Xn,Xn.,) 


Cov(*n,Xn-1 1 

Var(x  ) 

n 


E{(K'B,L  ,+K  L  )X 
n  1  n-1  n  n  n-1 


VlEa„-,Xn-1> 


a1B1E(Kn_i). 


(II.E.2.2) 


Substituting  in  the  values  of  the  i.i.d.  sequence  { }  with  the 
corresponding  probabilities  p^,  1-p2  from  (II. D. 1.3)  we  have 


‘r*V*V*V ♦*  • 

J’*  L*j»  -  V 


1-6*  +  c^Bf/O-a^S* 
i"  -  (i-a1 ) e* 


(II.E.2.3) 


Figure  II. E. 2.1  is  a  contour  plot  of  the  level  curves  for 

p(1)  =  Corr(X  , X  .  ).  Notice  that  in  this  model,  the  correlation  is 
n  n- 1 

restricted  in  range  over  that  of  the  linear  MA(1)  models.  Using  the 
IMSL  global  constrained  optimization  routine,  ZXMWD ,  with  multiple 
starts,  the  extremes  for  lag-1  serial  correlation  are  |p(1)|  S  0.4026, 
occurring  at  =  .903  and  81  =  ±.690.  In  Chapter  III,  we  give  a 
continuous  random  coefficient  model  with  MA(1)  correlation  structure, 
Laplace  marginal  distribution,  and  the  full  range  of  correlations,  i.e. 
| p( 1 )  |  S  5. 

b.  Invertibility  in  NLMA(T) 

It  is  well  known  (Chatfield  [Ref.  31,  p.  43])  that  if 


*„  ■  Zn  •  SlV,’  (II.E.2.H) 

is  a  linear  MA(1)  model,  then  substituting  ( 1  / B ^  )  in  for  does  not 
change  the  autocorrelation  function.  This  implies  that  the  linear  MA(1) 
model  is  not  uniquely  determined  by  its  autocorrelation  function. 

It  is  also  well  known  (Chatfield  [Ref.  31:  p.  43])  that  by 
successive  substitution,  the  MA(1)  model  in  (II.E.2.4)  can  be  written  as 
the  infinite  autoregression 


100 


MLMA(l) :  Contour  Plot  of 


i w.  \  wr."  T* 
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(II.E.2.5) 


Likewise,  if  1/6^  is  in  (II.E.2.4),  we  ha\e 


Z 

n 


X  -  1  X  , 
n  B1  n-1 


+  1  V 

sfn-2 


(II.E.2.6) 


Unfortunately,  only  one  of  the  two  processes  given  by  (II.E.2.5)  and 
(II.E.2.6)  yields  a  convergent  power  series  depending  on  whether 
| B  |  <  1  or  not.  Hence,  the  restriction  on  B1  called  "invertibility"  by 
Box  and  Jenkins  [Ref.  23:  p.  50],  guarantees  a  one-to-one 
correspondence  between  a  linear  MA(1)  model  and  its  autocorrelation 
function  by  restricting  B1  to  be  such  that  the  MA(1)  "inverted"  infinite 
autoregression  is  the  one  with  a  convergent  power  series  representation. 

This  definition  of  invertibility  is  not  totally  applicable 
to  random  coefficient  models  (such  as  NLMA(1))  with  MA(1)  correlation 
structure  because  it  has  not  been  established  that  there  exists  a 
corresponding  infinite  autoregr ession  model. 

Likewise,  there  can  be  an  infinite  number  of  models  that 
have  the  same  autocorrelation  function  and  marginal  distribution.  This 
is  the  case  in  NLMA(1).  As  was  seen  in  Figure  II.E.2.1,  each  contour 
line  corresponds  to  a  constant  value  of  p(1)  and  is  achievable  by  an 
infinite  number  of  combinations  of  (o^,B^). 

The  purpose  of  this  section  then,  is  to  find  a  different, 
but  meaningful,  way  to  restrict  the  (a^B^  rectangle  in  Figure  II.E.2.1 
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which:  (1)  does  not  further  restrict  the  range  of  p ( 1 ) ;  and  (2)  which 

within  the  region  the  NLMA(1)  model  must  be  uniquely  determined  by  p(1) 


and  either  or  8^. 

From  the  contours  in  Figure  II.  E.  2.1,  it  appears  that  the 
feasible  region  for  p(1)  can  be  partitioned  in  such  a  way  that  the  two 
goals  stated  above  can  be  achieved.  It  is  not  known,  however,  if  this 
partition  can  be  described  analytically.  Figure  II.E.2.2  is  an 
illustration  of  the  partition  into  a  center  region  and  two  complementary 
disjoint  regions.  The  center  region  is  roughly  defined  as  the  region  to 
the  right  of  a  line  from  (  —  1  ,  .667)  to  (-.577,1  )  and  to  the  left  of  a 
line  from  (.577,1  )  to  ( 1  , .  667 )  -  Both  lines  cut  across  the  contours  in 
the  depression  on  the  left  and  on  the  ridge  on  the  right.  The  center 
region  is  more  advantageous  for  two  reasons.  First,  p(1)  is  a 
continuous  function  of  and  81  in  the  center  region.  Secondly,  the 
parameter  estimation  is  more  likely  to  be  easier  if  the  most  extreme 
values  of  a1  and  8^  can  be  avoided  simultaneously.  Therefore,  we  shall 
call  the  center  region  of  Figure  II.E.2.2  the  "principal"  region. 

3.  A  Mixed  Autoregressive-Moving  Average  Model,  NLARMA(1,1) 

From  the  theorem  in  Section  II.  C. 2,  we  see  that  any  two 

(possibly  dependent)  Laplace  variables  can  be  combined  with  an 

independent  set  of  (again,  possibly  dependent)  Laplace  variables  to  form 

another  Laplace  variable.  Using  this  property,  if  we  replace  Xn_2  in 

NL A R (  2 )  by  L  ,,  then  the  marginal  distribution  of  (X  }  is  still 
7  n- 1  n 

standard  Laplace.  We  have  then 


a 


NLMA(l) :  Boundary  of  Principal  Region  in  Parameter  Coordinates 


(II. E. 3.1) 


M'x  . 

1  n  n-1 


Wn-I  + 


K  L  , 
n  n 


where  { K ' , K" } ,  (L  },  { K  }  are  as  previously  defined. 

n  n  n  n  y 

Notice  that  if  is  identically  zero,  corresponding  to  =  0, 
we  obtain  an  expression  of  the  form  given  by  (II.E.2.1)  for  NLMA(1). 
Likewise,  if  is  identically  zero,  we  have  the  NLAR(1)  model  as  given 
in  (II. D. 1 . 1)  . 

The  NLARMA(1,1)  model  has  the  same  correlation  structure  as  the 
linear  mixed  model  ARMA(1,1).  Using  (II. E. 3.1). 


E(XnX„-,>  -“,B,E<XnV  *  “262Ean-1Xn-1> 

*  E<xn-1KnLn) ■  (II.E.3.2) 

But  X  . ,  K  and  L  are  independent  so 
n-1  n  n 

E(X  X  ,  )  =  2a.  8 ,  +  a0B~{a161E(Ln  .Xn  9) 
n  n-1  11  2211  n-1  n-2 

+  a,B,E( L  . L  )  +  E(L*  .K  .)}.  (II. E. 3-3) 

2  2  n-1  n-2  n-1  n-1 

Conditioning  on  K  .j  ,  using  the  independence  of  {  }  and 

(X  _,  L  .)  and  dividing  by  the  Var(X  )  we  have 
n-2  n-1  n 

p(1)  =  a1B1  ♦  a2B2(1-p2'P3  +  h2|p2  +  |b3|P3)  .  ( II.  E.  3- 4) 
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where  p2,  p  ,  |  b  2 1  ,  |b^|  are  defined  in  (II.C.2.5)  through  (II.C.2.9). 
For  2.  >  2,  (II.E.3.3)  and  (II. E. 3.4)  become 

E(X  X  )  =  a.B.Ett  X  #)  (II.E.3.5) 

n  n-2,  1  1  n-1  n-2. 

and 

p(2,)  =  ot^S^p(2.— 1).  (II. E. 3.6) 

These  equations  are  the  same  as  those  of  the  ARMA(1,1)  model 
(see  Chatfield  [Ref.  36:  p.  58]).  However,  the  range  of  correlations 
is  significantly  reduced  over  that  of  ARMA(1,1).  Figure  II. E. 3.1 
represents  a  side-by-side  comparison  of  the  (p(1),  p(2))  space  for 
NLARMA (1,1)  and  the  familiar  linear  ARMA(1,1).  Although  p( 1 )  can  range 
from  -1  to  +1,  the  combinations  with  p(2)  are  severely  limited  in 
NLARMA( 1,1).  The  minimum  p(2)  in  NLARMA(1,1),  found  numerically  using 
the  reduced  gradient  method  is  approximately  -.025  at  p(1)  =  ±.2.  As 
j  p ( 1 ) |  increases,  p(2)  approaches  p(1)2. 

4 .  Higher  Order  Autoregressive  Models,  TLAR(p) 
a.  Introduction 

It  has  been  stated  by  Raftery  [Ref.  32]  that  there  exists 
NEAR(p)  models  for  p  £  2.  Also,  Nicholls  and  Quinn  [Ref.  16]  have  given 
conditions  for  the  existence  and  uniqueness,  strict  stationary,  etc., 
for  general  RCA(p)  models.  However,  only  for  the  NEAR(2)  and  the 
N  L  A  R (  2 )  processes  has  it  been  shown  explicitly  what  the  necessary 
innovation  is;  and  that  it  is  a  valid  random  variable. 
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POINT  PLOTS  OF  ADMISSIBLE  REGION  FOR  p(l)  AND  p( 2) 


Figure  II. R. 3.1.  Point  Plots  of  Admissible  Region  for  p ( 1 )  and  p ( 2 )  for  Linear  ARMA(1,1)  and 
NLARMA (1,1)  Processes 


For  p  £  3.  this  has  not  been  accomplished  for  the  general 

NEAR(p)  process;  nor  is  it  done  now  for  the  NLAR(p)  process.  However, 
D  t  h 

there  are  2P  different  p  order  autogres3ive  models  with  p  parameters 
that  are  special  cases  of  the  NLAR(p)  process.  These  models  are  called 
the  TLAR(p)  models.  The  innovation  for  the  second-order  model  was  given 
without  proof  following  the  theorem  in  Section  II. C. 2.  The  likelihood 
function  and  maximum  likelihood  estimation  of  was  given  in  Section 
II. D. 4  for  the  TLAR(1)  processes. 

The  TLAR( 2)  models,  including  the  two  TLAR(1)  models  only 
account  for  four  of  the  infinite  number  of  NLAR(2)  models  which  all  have 
the  same  AR(2)  correlation  structure  and  standard  Laplace  marginal 
distribution.  Since  there  is  a  variety  of  different  sample  path 
behaviors  obtainable  in  the  general  NLAR(2)  model,  it  is  possible  that  a 
TLAR( 2)  model  will  not  always  be  the  most  appropriate  model  for  a  given 
set  of  data. 

However,  as  is  shown  in  the  remainder  of  this  section,  the 
TLAR(p)  models  have  an  advantage  over  the  general  NLAR(p)  models.  The 
TLAR(p)  processes  for  p  £  3  exist;  are  easily  constructed;  are  partially 
time  reversible;  and  are  parsimonious  with  respect  to  parameters.  The 
parameters  in  the  TLAR(p)  process  are  easily  estimated  from  the 
conditional  likelihood  function  by  the  method  of  maximum  likelihood, 
b.  Existence  and  Uniqueness 

The  TLAR(p)  models  p  £  1  have  the  form 


X 


+  e  . 


( 1 1.  E.  4. 1 ) 


where  { }  is  assumed  stationary  with  Laplace  marginal  distribution; 

{K^  , . . .  ,K^P  ^ }  =  Kn  is  a  p-variate  discrete  random  variable  independent 

of  {e  }  and  X  .  ,  X  „,....  For  all  n 
n  n- i  n-2 


(1,  0,  0 .  0)  w.p.  a, 


(0,  1 ,  0,  . . . ,  0)  w.p.  a„ 


(0,  0,  0,  ....  1 )  w.p.  a 

P  P 

(0,  0,  0,...,  0)  w.p.  1  -  Z  a  -  X  >  0,  (II.E.4.2) 

i-1 


so  E(K^)  =  for  all  i  =  1,...,p.  The  2P  choices  of  model  arise  from 

the  selection  of  signs  for  each  of  the  (either  +1  or  -1). 

Now  if  {xn}  is  stationary,  then  the  following  expression  for 

the  characteristic  function  of  the  i.i.d.  innovation,  e  ,  follows  from 

n 

(II.E.4.1)  regardless  of  the  choice  of  signs  on  (The  distribution 

of  a  symmetric  random  variable  Z  is  the  same  as  that  for  -Z) .  We  have, 


l»Y(w)  =  E[exp{ -i<u(  l  <  +  OH, 

a  .  „  n  ri”  x  n 

n  1  =  1 


.(u)  C  I  a.  4>v  (to)  +  A], 


4>  (u>)  [ ( 1  “ X  )  <t>y(u)  ♦  X], 
en  n 


from  conditioning  on  Kn>  the  stationarity  assumption  of  (Xn)  and  the 
independence  of  of  X  ^  Xn_2  , . . . .  Therefore,  substituting  from 
(II.B.1.2) 


>)  =  +  x 


=  1/(1  +Au>2) 


(II.E.4.3) 


For  A  >  0,  (II.E.4.3)  is  recognized  as  the  characteristic  function  of  a 

scaled  Laplace  random  variable  with  scale  parameter  /T~. 

Since  (II.E.4.1)  can  be  written  as 


p  m 

X  -  X  {o,X  ♦  (k;1  -a. )X  }  +  E 
n  .  .  i  n-i  n  l  n-i  n 

i=«1 


(II. E. 4. 4) 


and  satisfies  the  conditions  in  Section  II. C. 2,  the  TLAR(p)  models  are 
RCA(p)  models.  Since  the  innovation  { }  and  {Kn}  are  i.i.d.,  then 
TLAR(p)  are  strictly  stationary  and  {Xn)  is  the  unique  solution  by  the 
theorems  of  Nicholls  and  Quinn  [Ref.  16:  p.  31  and  p.  37]. 
c.  Correlation  Structure 

t  h 

The  TLAR(p)  models  are  p  -order  autoregressive  in  the  sense 
that  E(Xn|Xn_1  =  x1 ,  X^_2  *  x^,...,  Xn_p  =  x  )  is  a  linear  function  in 
x.,  i  =  1,...,p.  It  is  also  autoregressive  in  the  sense  that  it 


I 


satisfies  a  set  of  Yule-Walker  equations.  Multiplying  (II.E.4.1)  by 

X  . ,  l  i  1 ,  and  taking  expectations,  we  have 
n-x, 

E(X  X  )  -  a, E(X  Xn  ,)  +  . . . +a  E(X  X  ).  (II.E.4.5) 

n  n-x,  i  n-1  n-x,  p  n-p  n-x, 

Dividing  by  VarCX^)  and  substituting  l  =  1 . p  into  (II.E.4.5),  we 

have  the  set  of  equations 


p(1)  =  a1  +  a2p(1)  +• • • +app(p-1 ) 
p( 2)  =*  a^O)  +  a2  +...+a  p(p-2) 

p(p)  ”  a1p(p-1)  +  a2p(p-2)  +...+  a  ,  (II.E.4.6) 

where  ai  *  ai(Sign  of  for  all  i  =  1 . p. 

For  the  TLAR( 2)  cases,  the  (p(1),  p ( 2) )  admissible  region  is 

the  entire  diamond  given  in  Figure  II.C.3.1.  It  is  divided,  however, 

into  four  right  triangles,  one  per  quadrant,  corresponding  to  the  sign 

of  X  ,  and  X  _  in  the  model, 
n-l  n-2 

d.  Conditional  Density  of  X  I X  ,  ,X  . . X 

n 1  n-1  n-2  n-p 

The  conditional  density  for  each  of  the  2 13  specific  choices 
of  signs  are  easily  found  noting  that  the  conditional  probability  is 


just  a  sum  of  Laplace  cumulative  distributions.  We  have 


P(X  <  x  X  .  -  x.,  X  .  «  x_,...,  X 
n  1  n-1  1 ’  n-2  2  n-p 


V 


P(K(1)x  +  K(2)x  +  . 

n  1  n  2 


,+K(p)x  +  /ITl  <  x) 

n  p  n 


=  a,P(/r”L  <  x-x, )  +  ...+  a  P(/T~L  <  x-x  )  +  AP(/A~ L  <  x) 
1  n  1  p  n  p  n 


,x-x 


‘  “lFLi 


Lx-x 


/r 


■*■...+  a  F, 


p  w/r 


+  AF. 


x 

/r 


(II.E.4.7) 


where  F^(*)  is  the  cumulative  distribution  function  of  a  standard 
Laplace  random  variable.  Taking  derivatives  with  respect  to  x,  we  have 


X  |X  . X 

n1  n-1  n-p 


1  F 

(x|x.,...,x  )  =  —^2  I  a!  f  r  I 
1  P  /A  i  =  1  1  M 


x-x. 


/Tf,  p- 

Ll/r 


(II.E.H.8) 


e.  Conditional  Maximum  Likelihood  Estimation  of  (a^...^  ) 

Since  there  are  many  p-variate  Laplace  distributions  that 

(Xp . X i  )  could  have,  and  that  the  particular  one  is  not  known  to  us, 

it  is  not  possible  to  form  the  exact  likelihood  function  which  is 
written 


f 


X 


n 


.X 


1 


n 

IT 

i=p  +  1 


( II. E. 4.9) 


Instead,  we  can  calculate  the  conditional  log-likelihood 
function  as  the  logarithm  of  the  product  of  the  first  (n-p)  terms  in 


ss V 


(II.E.4.9).  This  is  commonly  done.  See,  for  example,  Priestly 
[Ref.  33:  p.  350].  Using  -  c^s  i  gn  (Xn_^ ) ,  we  have  the  following 
single  expression  for  the  conditional  log-likelihood  function,  given  the 
n  realizations  from  TLAR(p)  process,  written  as  a  function  of  a^  for 
i  -  1 , . . .  ,p, 


L(a1 . a)  =*  l  Injf, 


p  i.;+i  xiixi-i . Xi-P 


n  ,  [p  v .  .  J  (x .  J 

l  in  I  •  /Tf,( — -V  , 

i-p*i  /r  (j-i  J  L(/nJ  L(^r)l 


(II.E.4.10) 


where 


v.  .  » 
ij 


Xi  "  Xi-j  if  aj  -  °*  j  =  1 . p' 

Xi  +  Xi-j  if  aj  <  °* 


(II. E. 4.1 1) 


i  *  p  +  1 . .  a.  =  |a.|  and  A  are  functions  of  the  variable  a.. 

J  J  J 

We  see  that  when  p  -  1  (II.E.4.10)  and  (II. E. 4. 11)  give  the 
expressions  used  in  the  TLAR( 1 )  process  in  Section  II. D. 4. 

As  a  function  of  (a^,...,a  ),  (II.E.4.10)  is  continuous 
throughout  the  interior  of  the  p-di  mens  i  onal  subspace  on  which  it  is 
defined.  It  is  not  differentiable  with  respect  to  anywhere  that 
a^  =*  0.  The  maximization  of  (II.E.4.10)  can  be  formulated  as  a 
constrained  non-linear  program  for  which  a  numerical  routine  would 
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III.  CONTINUOUS  RANDOM  COEFFICIENT  MODELS 
WITH  SYMMETRIC  NON-NORMAL  MARGINALS 

A.  INTRODUCTION 

The  discrete  random  coefficient  NLARMA(p,q)  models  studied  in  the 
previous  chapter  offered  a  variety  of  different  dependency  structures 
analogous  to  their  linear  ARMA(p,q)  counterparts  as  described  in  the 
Box-Jenkins  approach  to  time  series  analysis.  These  models,  however, 
could  be  considered  deficient  in  some  ways.  For  one  thing,  all  the 
models  have,  by  design,  the  same  marginal  distribution,  i.e.  Laplace. 
To  obtain  a  different  marginal  distribution  would  require  starting  over 
to  develop  the  appropriate  innovation  sequence.  Raftery  [Ref.  32]  has 
reported  some  results  in  extending  the  NEAR  framework  to  other  models 
with  different  marginals  and  ARMA  correlation  structures. 

Furthermore,  the  parameter  estimation,  which  is  easy  to  do  in 
Gaussian  linear  AR(p)  models,  is  not  particularly  easy  in  the 
autoregressive  process  of  the  NLARMA(p,q)  family.  In  the  moving  average 
and  mixed  models  of  NLARMA(p.q),  the  maximum  likelihood  procedure  is 
even  more  difficult.  Raftery  [Ref.  32]  claims  that  the  maximum 
likelihood  estimator  of  6 1 i n  the  NLA R ( 1 )  process  would  be  super¬ 
efficient  based  on  his  work  in  parameter  estimation  in  the  NEAR(1) 
process  and  the  extensions  that  he  has  proposed.  Super-efficiency  is 
not  an  attractive  property  of  an  estimator. 
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Again,  the  moving  average  model,  NLMA ( 1 ) ,  does  not  allow  for  the 
full  range  of  correlations  that  are  obtainable  with  the  linear  MA(1) 
model . 

Finally,  note  that  there  is  another  attractive  property  of  the 
random  coefficient  models  that  is  not  fully  exploitable  in  the  di screte 
random  coefficient  models  (NEAR(1)  and  NLAR(1)).  That  is,  in  the 
NLARMA(p,q)  models  the  coefficients  of  the  process  can  change  somewhat 
over  time  and  the  process  itself  remains  stationary.  Andel  [Ref.  34] 
has  noted  that  in  many  applications  of  time  series  analysis, 
particularly  in  the  fields  of  hydrology,  meteorology  and  biology,  the 
coefficients  of  the  model  are  attempting  to  describe  complicated 
processes.  The  coefficients  may  have  some  random  behavior  of  their  own, 
apart  from  that  usually  attributed  to  the  independent  innovation 
sequence . 

If  stationary  constant  coefficient  models  are  not  particularly  good 
at  modelling  such  systems  (as  suggested  by  Andel  [Ref.  34]),  then  the 
NLARMA(p.q)  models  would  not  be  much  better  because  the  coefficients  are 
limited  to  a  finite  (very  small)  number  of  possible  values.  However, 
Lawrance  and  Lewis  [Ref.  6]  have  shown  in  the  case  of  NEAR(1)  that  it  is 
possible  to  alter  the  character  of  the  sample  paths  of  a  given  low-order 
autoregression  by  extending  the  two-parameter  model  to  one  having  4 
parameters.  The  number  of  extra  parameters  could  be  excessive  and  the 
costs  in  parameter  estimation  unacceptable. 

In  this  chapter,  a  different  family  of  stationary  random  coefficient 
time  series  models  is  introduced  which  retains  many  of  the  favorable 
aspects  of  the  NLARMA  family  (specified  marginal  and  correlation 
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structure)  and  offers  alternatives  in  the  areas  pointed  out  above  as 
disadvantages  in  the  NLARMA  construction. 

The  symmetric  marginal  distribution  can  be  specified  by  one  shape 
parameter  to  be  any  one  of  an  infinite  number  of  non-Gaussian  distri¬ 
butions.  This  family  is  the  l -Laplace  family  and  is  examined  in  the 
next  section.  The  f amily--including  as  a  special  case  the  double 
exponential  (Laplace)  distribution--has  members  with  extremely  high 
kurtosis,  as  well  as  those  that  have  a  limiting  kurtosis  that  approaches 
that  of  the  Gaussian  distribution.  This  offers  a  significant  advantage 
over  the  NLARMA  models. 

Just  as  discrete  random  variables  are  needed  for  the  coefficients  in 
the  NLARMA  ( p  ,  q )  models,  the  square  roots  of  Beta  random  variables  are 
used  in  this  family  of  models  to  maintain  the  X,-Laplace  marginal  distri¬ 
bution.  The  square  root  Beta  transformation  theorem  is  the  key  result 
through  which  all  the  time  series  models  in  this  chapter  are  formulated. 
By  the  theorem,  Laplace  variables  are  changed  into  those  that  have 
2.-Laplace  distributions.  Previous  uses  of  Beta  random  variables  in 
modelling  non-Normal  time  series  is  evident  in  the  models  with  Gamma 
marginals  of  Lewis  [Ref.  35]  and  Hugus  [Ref.  30]. 

The  fact  that  the  coefficients  are  continuous  instead  of  discrete 
allows  for  a  continuous  variation.  That  they  are  functions  of  Beta 
random  variables  restricts  the  variation  to  a  bounded  interval.  This  is 
likely  to  facilitate  the  modelling  of  those  "complicated"  systems  as 
described  by  Andel  [Ref.  31*]. 
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The  principal  models  investigated  in  this  chapter  are  those  with 
first-order  autoregressive  correlation  structure.  They  are  first-order 
Markov  processes.  For  the  purpose  of  discussing  parameter  estimation  in 
this  family  of  autoregressive  models,  as  opposed  to  the  NLAR(1)  family, 
the  focus  is  narrowed  to  that  AR(1)  model  of  the  family  with  Laplace 
marginals--the  so-called  Beta-Laplace  First-Order  Autoregressive  model, 
BELAR(I).  Several  point  estimators  of  location  and  scale  are  discussed 
and  examined  through  simulation  in  SIMTBED  [Ref.  15].  The  one  parameter 
which  uniquely  determines  all  the  correlations  of  lag  k  in  the  BELAR(l) 
model  can  be  estimated  by  a  least  squares  procedure  which  has  very  nice 
asymptotic  properties.  The  maximum  likelihood  estimator  of  serial 
correlation  is  also  obtained  using  numerical  methods. 

First-order  autoregressive  correlation  structure  is  not  the  only 
type  of  dependency  relationship  that  is  obtainable  from  using  the  square 
root  Beta-Laplace  transformation.  In  the  last  section  of  this  chapter  a 
first-order  moving  average  model  and  an  extension  to  a  qth-order  model 
are  introduced.  The  MA(1)  model  retains  the  full  range  of  correlations 
of  the  linear  MA(1)  models.  This  was  not  the  case  in  the  NLMA(1)  model. 
B.  2.-LAPLACE  DISTRIBUTION 

1 .  The  Z-Laplace  Random  Variable 

It  was  shown  in  Section  II. B  that  the  standard  Laplace 
distribution  belongs  to  the  class  of  infinitely  divisible  distributions. 
The  probability  density  function  of  a  Laplace  distributed  variable  was 
given  in  (II.B.1).  The  character i sti c  function  of  the  standard  Laplace 
random  variable  was  given  in  (II. B. 2).  Thus  if 


X 


l  >  0  , 


(III. B. 1.1) 


(u>) 


<T^' 


f 


then  X  is  a  random  variable.  In  fact  it  is  the  difference  of  two 
independent,  identically  distributed  GammaU.I)  random  variables  where  l 
is  the  shape  parameter  and  1  is  the  scale  parameter.  Therefore,  if  X 
has  a  characteristic  function  given  by  (III. B. 1.1),  then  X  is  an 
ft-Laplace  random  variable. 

Since  (III. B. 1.1)  is  a  real  function  of  u,  X  is  a  symmetric 
random  variable.  It  is  easily  verified  that 


£(Xn) 


0  if  n  is  odd, 

(k+1  )£k^k^  if  n  =  2k,  k 


(III.B.1 .2) 

1,2,..., 


[k  1 

where  b  =  b  (b+1  )  . . .  (b+k-1 )  for  all  b  >  0.  Since  all  odd  moments  are 
zero  in  (III. B.  1.2),  the  Ji-Laplace  distribution  is  not  skewed  for  any 
i  >  0.  From  (III.B.1. 2)  we  find  that  the  kurtosis  is 


E<1<n> 


( E(X  ) ) 
n 


Var2(X  ) 
n 


3[2V2]  _  3 

Vk%) 2  ~  5  i 


(III.B.1. 3) 


The  kurtosis  approaches  3  as  l  +  which  corresponds  to  that  of  a 
Normal  distribution. 

Since  an  2,-Laplace  random  variable,  X(Jl)  ,  is  the  difference  of 
two  i.i.d.  Gamma(£,1)  random  variables,  we  obtain  the  density  for  XU) 


by  using  conditional  expectations. 


If  G ^  ( £ , 1 )  and  G2(2,1)  are  the  i.i.d.  Gamma(2,1)  random 
variables,  then  conditioning  on  G^ ( il,  1 )  ,  we  have 


P(X<x)  =  P(G1-G2<x)  =  Eg  {P(G1-G2<x|G2=g)} 

=  Eg  { P(G1 <x+g) } 


(III.B. 1.4) 


Since  Gamma  random  variables  are  non-negative, 


P(G1  <x+g) 


0  if  g  <  -x, 

F  (x+g)  if  g  >  x, 

G, 


( III.B. 1 .5) 


where  F„  (x+g)  is  the  cumulative  distribution  function  of  G..  The 
G1  1 

expressions  are  shortened  from  G  ( JL ,  1  )  to  G  ( 2, ,  1  )  ,  because  they  are 
i.i.d.  Therefore,  (III.B. 1.4)  can  be  written  as 


g=00 

P(X<x)  =  j  FQ( x+g )f Q(g )dy ,  (III.B. 1.6) 

g=L( x ) 


where 


g  exp(-g)  g  >  0, 

r(  2.) 

0  otherwise,  (II. B. 1.7) 

is  the  density  of  a  Gamma  (2,1)  random  variable  and  again,  because  of 


f r(g ; 2)  = 


the  non-negativity 


i  -x  if  x  <  0. 


(III.B. 1.8) 


Differentiating  (III.B. 1.6)  using  Leibniz'  rule  for  the 
derivative  of  an  integral  with  variable  limits,  we  have,  after  some 
simplification 


O 

i;l)  =  \ 


g=L(u) 


r2(£)  [g(g+u) 


exp{-(2g+u)}  dg.  (III.B. 1.9) 


Now  if  i  is  a  positive  integer,  (III.B. 1.9)  can  be  evaluated 
analytically  using  integration  by  parts.  If  £  =  1  we  obtain  the  density 
of  the  standard  Laplace  distribution.  For  £  =  2,3,  11  the  densities  are 
also  well  known  derivations  given,  for  example,  as  textbook  problems  by 
Feller  [Ref.  25:  p.  64].  Feller  however  looks  at  the  results  of 
(III.B. 1.9)  as  the  n-fold  convolution  (n  =  2,3,4)  of  i.i.d.  standard 
Laplace  random  variables.  Figure  III.B. 1.1  shows  the  densities  of  the 
£ -Laplace  random  variable  for  £  =  1,2, 3, 4.  Note  how  the  graphs  take  on 
the  shape  of  a  Normal  density  with  o2  =  21. 

2.  Numerical  Evaluation  of  the  £-Laplace  Density 

If  £  >  0  and  is  not  an  integer,  then  (III.B. 1.9)  must  be 
evaluated  numerically.  We  will  be  interested  in  the  evaluation  of  the 
density  in  (III.B. 1.9)  for  0  <  £  <  1,  in  order  to  calculate  conditional 
densities  and  likelihood  function. 


Figure  III. B. 2.1  displays  examples  of  densities  for  non-integral 
£  obtained  by  using  the  IMSL  numerical  integration  scheme  DCADRE  to 
evaluate  (III. B. 1.9).  The  upper  limit  of  integration  in  (III. B.  1.9)  is 
replaced  by  a  suitable  constant  m  >  1 .  Since  for  g  >  1  and  fixed  Z  and 
u  >  0, 


r2(£) (g(g+u)  J 


exp{ -(2g+u) } 


exp { ~(u+2g ) ) 

r2U) 


(III. B. 2.1) 


then 


|DCADRE-fx(u;£)  |  <  •  (III. B. 2. 2) 

Difficulty  in  integrating  comes  about  because  of  the  singularity 
at  the  lower  limit  of  integration.  If  Z  Z  1  ,  this  singularity 

i-i.  si-i 

disappears  by  rewriting  (1/(g(g+p))  as  (g(g+p))  .  For  Z  <  1,  there 

are  two  alternatives  for  removing  the  singularity.  We  can  transform  the 

l 

variable  of  integration,  g,  to  become  t  =  g  and  the  singularity  at 
g  =  0  is  removed.  Or,  we  could  do  an  integration  by  parts  to  remove 
either  the  singularity  at  g  =  0  for  u  >  0  or  at  g  =  -u  f or  u  <  0 .  In 
either  case,  the  remaining  integral  must  be  evaluated  numerically  for 
u  *  0. 

Since  X  is  a  symmetric  random  variable  we  can  rewrite 
(III. B. 1.9)  using  integration  by  parts  to  obtain  an  expression  that  will 


be  easier  to  apply.  For  all  u  *  0 


re 


If  &  £  .5  note  that  f^Cu)  is  not  defined  at  p  =  0.  For  l  >  . 5  and 
u  -  0, 


?0  -1 

fx(0;H)  =■  r(  2S.-1 )  /  {  r2  ( «.)  2  }  <  »  .  (III. B. 2. 4) 


3.  The  Square  Root  Beta-Laplace  Transformation 

The  principal  result  of  this  section  is  the  proof  of  the  so- 
called  square  root  Beta-Laplace  transformation  theorem.  By  this 
technique,  an  ^ -Laplace  random  variable  can  be  transformed  into  an 
i^-Laplace  random  variable  where  £  JL  .  The  time  series  models 
developed  in  Sections  III.C  -  III.F  rely  on  the  following: 

Theorem: 

Let  X  -  l -Laplace  and  B  -  Beta(a, 2,-a) ,  where  0  <  a  <  l  and  B  is 

1  /p 

defined  on  the  interval  [0,1],  i.e.  standard  Beta.  If  Y  =  B  X,  then 
Y  -  a-Laplace. 

Proof : 

By  conditioning  on  B,  we  obtain  the  following  expression  for  the 
characteristic  function  of  Y; 
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EgC  E{ exp ( i b  Xw)}] 


EB[{l/(l+bu)2  )}*]. 


(III. B. 3.1) 


Since  bui2  >  0,  a  convergent  power  series  representation  of 
(III. B. 3.1)  is  given  by 


"  oLr-J  k  k 

t>Y(w)  =  E  {  I  (-“2)  bk}, 

x  B  k=0  K‘ 


(III. B. 3. 2) 


where  again  =  l(M)...(l+k-1)  for  k  =  1,2,...;  HI-0]  =  1. 

Interchanging  the  expectation  and  summation  in  a  convergent 
power  series  gives 


r  k  k 

0Y(«)  =  [  (-a,2)*  E(BK), 

k=0 


(III. B. 3. 3) 


From  Johnson  and  Kotz  [Ref.  36:  v.  2,  p.  40],  we  have 


E(Bk)  =  for  k  integer. 


Substituting  (III. B. 3. 4)  and  (III. B. 3-3).  we  have 


<fh,U)  =  y  2__  (-0)2)k  =  (t^-t)3. 


(III. B. 3. 4) 


(III. B. 3. 5) 


C.  i-LAPLACE  FIRST-ORDER  AUTOREGRESSIVE  TIME  SERIES  MODEL 
1 .  Introduction 

In  this  section,  we  exploit  the  square  root  Beta-Laplace 

transform  to  define  a  2-parameter  first-order  autoregressive  model  in 

5,-Laplace  variables.  The  first  parameter,  l,  determines  the  non- 

Gaussian  symmetric  marginal  distribution  of  the  time  series  ensemble. 

The  second  parameter,  a,  given  the  value  of  l,  determines  uniquely  the 

lag-1  serial  correlation.  Since  the  model  is  shown  to  be  first-order 

Markovian,  a  determines  the  entire  autocorrelation  function  up  to  the 

sign.  We  show  also  that  the  models  are  always  partially  time  reversible 

with  respect  to  both  runs  probabilities  and  directional  moments. 

Writing  the  stationary  time  series  {X  (5,)}  in  the  form  of  an 

n 

additive  random  coefficient  equation,  we  have 

X  U)  -  A1/2(a,8.-a)X  .(SO  ♦  B1/2U-a,a)L  (SO,  (III. C. 1.1) 

n  n  n-1  n  n 

where  {X  (i)}  is  assumed  to  be  a  stationary  time  series  with  a  marginal 

1  /2 

5,-Laplace  distribution;  (A  (a,£.-o)}  is  an  i.i.d.  sequence  such  that 

n 

1  /2 

An(a,(L-a)  is  a  standard  Beta;  (B  (5.-3, a)}  is  an  i.i.d.  sequence 

1  /2 

independent  of  { A^  (a,5,-a)}  such  that  Bn(i-o,a)  is  also  standard  Beta; 

and  (1^(5.)}  is  an  i.i.d.  sequence,  independent  of  the  coefficient 

processes,  such  that  L  ( SO  is  5,-Laplace.  The  coefficient  A  (a,  5. -a)  and 

B  (SL-a,a)  are  assumed  to  be  independent  of  X  ,  ,  X  _,  etc.  If  it  is 
n  n-1  n~2 

assumed  that  X^  1  ( 5. )  has  a  5,-Laplace  distribution,  then  by  the  theorem 
in  Section  III.B.3  so  doe3  Xn(50.  The  fact  that  the  process  is 
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Markovian  follows  by  construction.  To  start  the  process  in  the 
stationary  distribution  set  XQ(d)  -  LQ(d)  . 

The  parameter  space  is  d  >  0  and  0  £  a  £  2,. 

For  the  Beta  random  variables  A  and  B  (hence  their  square 

n  n 

root3)  to  be  properly  defined,  each  of  the  parameters  must  be  positive. 

Hence,  when  a  =  0  or  a  =  d,  (III.C.l.i)  as  defined  above  is  no  longer 

1/2  1/2 

appropriate  because  each  of  A  and  B  has  a  parameter  that  is 

n  n  * 

1  /2 

identically  zero.  If  a  =  0,  it  is  understood  that  the  {A  }  sequence 

n 

1  /2 

is  identically  zero  and  the  {  8n  }  sequence  is  one;  therefore, 

(III.C.l.i)  becomes  X  (d)  =  L  ( d)  and  the  (X  }  sequence  is  the  (L  } 

n  n  n  n 

1  /2 

corresponding  to  the  i.i.d.  d-Laplace  case.  For  a  =  d,  A^  is  one  and 

1  /2 

B  is  identically  zero;  therefore,  if  X.  -  Ln(d),  then  X  is 
n  u  u  n 

d-Laplace,  but  is  not  an  ergodic  process. 

If  we  let 


1  /? 

en  =  %  (*-a.a)LnU)  (III. C. 1.2) 

then  by  the  Theorem  in  Section  III.B.3,  -  (  l -a ) -L  apl  ace  with 

E(e  )  =0  and  Var(e  )  =  2(£-a)  for  all  n.  Since  the  variance  must  be 
n  n 

non-negative,  it  is  also  necessary  that  a  S  l.  By  the  stationarity  of 

1  /2 

(X  }  and  since  A  (a, i-a)X  ( d )  is  independent  of  e  ,  the 


characteristic  function  of  the  right-hand  side  of  (III.C.l.i)  gives 


Examples  of  sample  path  behavior  for  selected  1  and  a  are  given  in 


Figure  III. C.  1.1.  Note  that  although  the  correlation  coefficient  is 
approximately  0.8  for  all  sets  of  l  and  a  in  Figure  III. C.  1.1,  there  is 
considerable  difference  in  the  sample  path  behavior  as  l  changes.  For 
the  samples  from  small  values  of  £,(.10  and  .05).  there  are  runs  of 
values  that  are  very  nearly  zero  in  magnitude. 

2.  Correlation  Structure 

Using  equation  (III. C. 1.1)  recursively  along  with  the 
stationarity  and  independence  of  the  process  {X^},  we  have 

E{ X  ( £,)X  U)} 

p(1)  -  Corr(XnU),Xn_lU)  -  W  --ri  y . 


E[  {A^/2(a,  £,-a)Xn_1  (iD+e^X^  U)  ] 

E{X*  ,(£)'} 
n-1 

E{Ay2(af£-a)}EtX*  U) } 

=  - - - — (UTLJ -  =  E{n  (ct'il'c‘)}-  (HI. C. 2.1) 


From  Johnson  and  Kotz  [Ref.  36:  v.  2,  p.  40],  we  have  for 
-  Beta(a, l-a) ,  that 

E(  A^(a,  2. -a) )  =  for  all  n,  r  >  0,  (III.  C.  2. 2) 


where  f(.)  is  the  incomplete  Gamma  function.  Therefore 


(III. C. 2. 3) 


m  r(a+i  /2)  r( £)  ar(a+i  /2)r(£+i ) 

1  "  r(M/z)r(a)  “  irU+i /2)r(a+D* 

Note  that  as  a  -*•  £,  then  p ( 1 )  -*■  1.  Similarly  as  a  -*■  0,  p(1)  -*■  0. 

Therefore,  we  obtain  a  full  range  of  positive  correlations  in  a 

one-to-one  function  of  a  for  any  given  value  of  i. 

Also  from  (III. C.  1.1),  we  see  that  the  process  is  explicitly 

autoregressive.  It  is  also  autoregressive  in  the  sense  of  expectations 

in  that  E(X  (£)lx  ,(£)  -  x)  is  a  linear  function  of  x.  Since 
n  '  n- 1 

I  k  I 

(III. C. 1.1)  defines  a  first-order  Markovian  process,  p(k)  »  p(l)1  1  for 
all  k.  To  see  this  we  write  for  all  k 


=  p( 1 ) p( k-1 ) 


-  p(  1 ) p(k-2) p(  1 ) 

=  (pd)}^'. 

1/2  1/2 
If  we  replace  (a,H-a)  in  (III. C. 1.1)  by  -A  (a,l-a)  we  have 


p(1) 


f(q-*-1  /2)  f(  l) 
r(  £  +  1  /2)  f(a) 


af  (  a  +  1  /2)  f(  £-*-1  ) 
ir(  £+i  /2) r(o+i ) 


(III. c.2.5) 


We  can,  therefore,  achieve  a  full  range  of  negative  correlations,  and 


likewise 

p(k)  =  (-1)  lkl{p(1)} ^  for  all  k. 


(III. C. 2. 6) 


3.  Partial  Time  Reversibility 

The  2,-Laplace  first-order  autoregressive  models  are  partially 

time  reversible,  both  with  respect  to  the  directional  moments, 

[X2U)X  (1)}  for  m  =  0,  ±1,  ±2,...,  and  with  respect  to  runs 

n  n-m 

probabilities,  P{X  U)<X  .  U) }  =  P(X  (2.)>X  ,(l)). 

n  n-1  n  n-1 

Using  mathematical  induction,  stationarity  of  {X  (2,)},  and  the 

independence  of  the  coefficients  and  the  innovation  from  each  other  and 

previous  values  of  (X  (2.)},  it  is  the  case  that  (X2(2.)X  (2,)} 

n  n  n-m 

=  E {  X  (2)X2  (2,)}  =  0  for  all  n  and  for  all  m  =  0,  1,  2 .  Let 

n  n-m 

X^~  2,-Laplace.  For  m  =  0,  E(X2)  =  0  by  (III.  B.  1.2).  Assuming  for  m  =  k 
tha.  E^XnXn-k^  =  we  have  £’or  m  =  k  +  1  after  substituting  from 
(III. C.  1.1)  and  (III.C.1 .2)  that 


1  /? 

E{  X  2X  =  E{  ( A  X2  +2  A  X  ,e  +e2)X 

n  n-(k+1)  n  n-1  n  n-1  n  n  n-(k+1) 


E(  A  )  E{  X2  ,X  .  } 

n  n-1  n- (k+1 ) 


=  E(  A  )  E( X 2X  ,  )  =  0. 
n  n  n-k 


(III.C.3. 1 ) 


Assuming  for  m  =  k  that  E(X  X2  ,  )  =  0,  we  have  for  m  =  k  +  1  after 

n  n-k 

substituting  '■gain  from  (III. C.  1.1)  and  (III.C.1. 2) 


(III.  C.  3. 2) 


■  J  'l  i  .  I 1. 


v,  v V  ^ 


E{X  X2  .  } 
n  n-(k+1 ) 


E{X2  ,„.,x(Ay2X  +en )} 
n- (k+1 )  n  n-1  n 


E(  A1  /2)E{X2  .  ,X  . } 

n  n-  (k+1 )  n-1 


E(»’/2)E(X*_kXn)  -  0. 


To  see  that  this  model  is  also  partially  time  reversible  with 

respect  to  runs  probabilities,  we  show  that  the  random  variable  =  X^ 

-  X  ,  is  symmetric.  Now  Z  is  symmetric  if  and  only  if  the 
n-1  n 

characteristic  function  of  Z  is  real  valued.  We  write 

n 


t>z(u>)  =  E[exp{iui(Xn  -  Xn-1  )}] 


=  E[exp[iui{en-(l-A^/2)Xn_1  }  ]  ] 


1  /? 

E{exp(iuj£n)}E[exp{-ia)(1-An  )xn_.,  H 


T^ra  EA[E[exp{-iw(1-a1/2)Xn_1}]] 


1  +o) 


E. 


,  1/2,2  2 

1  +  ( 1-a  )  w 


(III. C. 3- 3) 


Since  (III. C. 3. 3)  is  real  valued  that  concludes  the  proof. 
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D.  THE  BETA -LAPLACE  AUTOREGRESSIVE  MODEL,  BELAR(I) 

1 .  Introduction 

In  this  section,  we  set  i  =  1  in  (III. C. 1.1)  and  (III. C. 1.2)  to 
obtain  the  following  expression  for  the  BELAR(I)  process 

1  /? 

X  =  A  (a,1-a)X  +  e  ,  (III. D. 1.1) 

n  n  n-1  n 

where  {e  }  is  an  i.i.d.  sequence  with  e  -  ( 1-a)-Laplace  with  moments 
n  n 

and  density  given  by  (III. B. 1.2)  and  (III. B. 2. 3)-  X^  now  has  a  standard 

Laplace  marginal  distribution.  The  only  parameter  in  the  model  is  a 

with  0  <i  a  £  1.  All  the  results  of  Section  III.C  still  hold  with  1  =  1. 

Examples  of  sample  path  behavior  are  given  in  Figure  III. D. 1.1. 

We  do  two  things  in  this  section.  First,  we  derive  the 

equations  for  the  conditional  density  of  X  |X  .  .  The  second  is  the 

n  1  n- 1 

derivation  of  joint  density  and  the  logarithm  of  the  likelihood 
function.  The  expression  is  used  in  Section  III.E.6  to  obtain  the 
maximum  likelihood  estimate  for  a. 

2.  The  Conditional  Density 

To  find  the  conditional  density  of  X  |X  .,  we  will  need  the 

n  1  n-  1 

1  /2 

density  of  A  (a,l-a).  Let  A  be  a  standard  Beta  random  variable  with 
n  n 

parameters  ( ot ,  1  -  cx )  .  Since  A  is  defined  only  on  the  given  interval, 


zero  to  one 


E  PATHS 


Figure  III.  D.  1.1.  BF.LAR  (1 )  :  Sample  Paths  for  Specified  Values  of  a  and 

Corresponding  p(l)=y 


0 


otherwi se 


P(A  <a)  = 
n 


x=a‘ 


x=0 


fA  (x;a)dx, 
n 


0  <  a  <  1  , 


(III. D. 2.1) 


where  f  (x;a)  is  the  standard  Beta(a,1-a)  density  given  by 
n 


f*A  (x ;  a) 
n 


a01-1  (1-a)a/r(ct)r(1-a)  0  <  a  <  1, 

0  otherwise.  (III. D. 2. 2) 


Dif f erentiating  (III. D. 2.1)  with  respect  to  a,  we  obtain  the  following 
expression  for 

f.1/2(aia>  •  r~a)'r(l'-a)  TTXS  ■  0  <  a  <  1 '  (III. D. 2. 3) 

An  (1-a  ) 


Examples  of  (III.D.2.3)  ar’e  given  in  Figure  III. D. 2.1. 

Now  we  evaluate  P(X  <  x  |  X  _-|=y)  using  (III. D. 1.1),  (III.  B.  1.2), 

1  /2 

and  (III. B. 2. 3).  Conditioning  on  A  (a, 1-a)  we  obtain 


DENSITY 


Figure  III. D. 2.1.  Examples  of  the  Density  of  A  (a,l-a)  for  Specified 

Values  of  0<a<l  1 


P(Xn<x|Xn.,-y)  .  PlAn' Vi  *  E„  <  Xlxn-ryl 


E  1/2CP(En  <  X  -  yA^ta.I-oOlA^-a}] 
A 

n 


E,1/2fPUn  <  x'as’>) 

A 

n 


E  {Fe  (x-ay)) 

A  n 

n 


a=L2(x) 


=  [  F£  (x-ay)  f  /2(a;a)  da  ,  (III. D. 2. 4) 

,  f  \  £n  A1 

a=L.(x)  n 


where  from  (III. B. 2. 3)  the  cumulative  distribution  function  of  can  be 


written  as 


F  (x-ay)  = 
e 

n 


u=x-ay 


+  f  (u;1-a)du  if  x-ay  £  0, 


(III. D. 2. 5) 


u=ay-x 


f  f  (u;1-a)du  if  x-ay  <  0, 


and  Li(x),  i  =  1,2  are  the  limits  of  integration  on  a  which  may  be 
functions  of  x. 


Since  F  (x-ay)  changes  definition  for  negative  and  positive 


I 


h 


(x-ay)  and  since  0  <  a  <  1  ,  we  rewrite  (III.D.2.4)  based  on  the  ratio 
x/y,  which  is  a  constant.  Thus 


a-1 


p(vxiv,-y> 


I  F  (x-ay)f  1/2(a;a)da  if  x/y  £  1  or  x/y  £  0; 

en  A 

a-0  n 


( III. D. 2. 6) 


a=x/y 


a-0 


F  (x-ay)f  1/2(a;a)da 
en  A 

n 


a=1 


+  f  F^  (x-ay)f  1/2(a;o)da  if  0  <  x/y  <  1 

•  ^  A 


.  "n  A 

a=x/y  n 


Differentiating  (III.D.2.4)  with  respect  to  x  using  Leibniz' 
rule  gives  the  following  general  expression  for  the  conditional  density. 
We  have 


rx  |x  <*|y>  ■  s{p(V*lxn-i-y)1 

n  1  n-1 


a=L2(x) 

=  f  f  (x-ay; 1 -o)f  1/2(a;a)da 

,  ,  .  n  A 

a-L  (x)  n 


+  F  {x-yL  (x)}f  {L  (x);a)  — L  ( x ) 
t  2  .1/22  dx  2 

n  A 

n 
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■xr*-j 


r 7 TT 


7  •’TTT 


^V 

»*/• 


i 

i 

ii<- 

i  ■- 


F  (x-yL  (x)f  /2iL  (ilialgjL  <x). 
n  A 

n 


(III. D. 2. 7) 


Fran  (III. B. 2. 3),  (III. D. 2. 3)  and  (III.D.2.5)  set 


h(g,a)  ■  Za^'^xpl-CZ^Ix-ayll)  t'^^Z  |  x-ay  |  nO  _  (m.D.2.6) 

r3( 1-a)  r(o)  ( 1 -a2 ) a  (g  + j  x-ay | )1 +a  (1-a) 

Now  using  (III. D. 2. 7)  to  differentiate  each  expression  in  (III. D.  2. 6), 
we  have  the  following  explicit  expressions  for 


a=1  g=» 

j  j  h(g,a)dgda  if  x/y  £  1  or  x/y  £  0  , 
a=0  g=0 


rx  |x 

n 1  n-1 


( III. D. 2.9) 


a=x/y  g=« 

|  j  h(g ,a)dgda 
a=0  g=0 

a-1  g=» 

*  j  j  h(g,a)dgda  if  0  <  x/y  <  1  . 
a=x/y  g=0 


It  will  be  seen  later  that  working  with  (III. D. 2. 9)  will  be 
inconvenient.  Hence,  we  rewrite  (III. D. 2. 9)  as 


The  conditional  density  in  (III. D. 2. 10)  can  assume  different 

shapes  as  a  function  of  x  depending  on  the  fixed  conditioning  value,  y, 

and  the  particular,  fixed  a.  If  a  3  0,  then  (III. D. 2.  10)  becomes  the 

standard  Laplace  density  as  given  in  (II. B. 1.1)  with  p  =  0  and  A  =  1. 

If  y  =  0,  then  (III. D. 2. 10)  becomes  the  ( 1-a)-Laplace  density  as  given 

in  (III. B. 2. 3)  with  l  =>  1-a.  In  Figure  III.D.2.2  are  presented  different 

examples  of  (III. D. 2. 10)  for  a  fixed  y  and  different  values  of  a.  Note 

that  if  a  <  1/2  then  (III. D. 2. 10)  is  continuous  for  all  x.  If  a  i  1/2 

and  x  =  y,  (III. D. 2. 10)  is  undefined,  e.g.,  x  =  y  =  0. 

In  a  similar  manner,  expressions  for  ( III.D .  2. 4)  -  ( III .  D .  2. 1 0) 

can  be  derived  for  the  BELAR(I)  model  with  negative  correlat i ons  . 
1  /  2 

Placing  -  A  ^  (a, 1-a)  in  (III.D. 1.1),  we  replace  x-ay  by  x+ay  and 

determine  the  appropriate-  form  of  the  conditional  density  based  on  the 
ratio  (-x/y).  We  have  for  the  negative  BELAR(I)  process 
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fx  jx  (x|y) 
n1  n-1 


a-1 


f  f  { (x+ay) ; 1 -a}f  1/2(a;ot)da 

n  £n  A 

=0  n 

if  -x/y  £  1  or  -x/y  £  0, 


a«-x/y 

f  f£  {(x+ay) ,1-a}f  1/2(a;a)da 


(III. D. 2. 11) 


a=1 

+  f  f  { (x+ay ) ;  1  — ot }  f  1/2(a;a)da 

■*  .  en  A 

a-  -x/y  n 

if  0  <  -x/y  <  1 . 


3. 


using  f 


The  Joint  Distribution  and  the  Likelihood  Function 

An  expression  for  the  joint  density  of  X  ,  can  be  written 

lv  (x  |x  J  and  fv  <xi)  as  follows: 

.  X  ,  n1  n-1  X,  i 

n1  n-1  1 


n-1 


fX  . .  .X,  (xn . X1  }  fX  (X1  )  11  f 

n  1  Ik 


,  xX  .  lx  ,  (xn-(k-1)  lxn-k)* 
1  n-(k-l)1  n-k 


(III. D. 3.1) 


The  log-likelihood  function  as  a  function  of  a  given  {X^}  is  just  the 
natural  logarithm  of  (III.D.3-1)  .  We  have 


n-1 

L(ct)  -  -(In  2  ♦  |x  |)  +  l  ln(f  ,  (xn-(k-1 ) |Xn-k} } * 

1  k-1  n-(k-1)|Vk  n  U  i;i  n  K 


(III. D. 3. 2) 


for  odd  lags.  We  just  check  for  0  <  -x/y  <  1  and  choose  the  appropriate 
value  of  c  in  (III. D. 4. 6)  and  (III. D. 4. 8)  where  x-ay  is  replaced  by 
x+ay . 

b.  The  Methodology 

Attempts  to  evalute  the  conditional  density,  as  given  by 
(III. D. 2. 8)  and  (III. D. 2. 9),  using  the  standard  IMSL  double  integration 
routines  failed.  Even  the  IMSL  routine  DBLIN  which  is  often  successful 
in  handling  ill-behaved  integrands,  was  unable  to  evaluate  (III. D. 2.8) 
around  the  singularities.  For  a  <  1/2,  along  the  lines  a  =  0  and  a  =  1, 
(III.D.2.8)  is  unbounded.  Similarly  for  a  £  1/2,  along  the  line  a  =  1 
and  at  the  pcin*-  (g,a)  =  (0,x/y)  for  0  <  x/y  <  1,  (III. D. 2. 3)  is 
unbounded.  Arbitrarily  declaring  (III.D.2.8)  to  be  zero  under  these 
conditions  did  not  always  allow  DBLIN  to  accurately  evaluate 
( III. D. 2. 9)  . 

We  succeeded  in  evaluating  the  conditional  density  by 

working  with  the  form  ^iven  by  (III. D. 2. 10)  with  f  (a;o)  given  by 

A  U 
n 

(III. D. 2. 3)  and  f  {(x-ay);1-a}  given  by  (III. B. 2. 3).  We  used  the  IMSL 
en 

routine  DCADRE  to  construct  an  extensive  table  of  values  for  the  ( 1  — ct )  — 

Laplace  density  with  the  intention  to  linearly  interpolate  from  the 

table  as  needed.  The  error  in  the  value  of  f(|u| ;1— a)  in  the  table  is 

controlled  by  DCADRE.  The  error  in  the  value  of  f  (|u_|  ;1— a)  obtained 

e  O' 

by  using  linear  interpolation  for  |Uq|  not  in  the  table  is  calculated  in 
the  standard  way.  From  Gerald  [Ref.  28:  p.  168] 


dzf£  ''c-.l-a) 

j  Error  Interpolation]  -  |h  3  1  )  - j - j  f  (III. D.  4.1) 

where  h  is  subinterval  length  and  s  =  (Ug-u)/h.  Substituting  the  second 
divided  difference  into  ( II I . D . M . 1 )  ,  in  place  of  the  unknown  second 
derivative  and  also  noting  that  the  worst  case  for  linear  interpolation 
is  at  the  center  of  the  subinterval,  we  have 

|  Error  Interpolation]  <  j- g  A2f  ( |  u  | ;  1  — ot)  ,  (III.  D.  4. 2) 

'  en 

where  A2f  is  the  second  difference.  Because  f  ( |  u  | ;  1  — ot )  is  non- 
e  e  1  1 

n  n 

negative  and  monotone  decreasing  in  |u|,  the  largest  values  of  A2f  are 
in  su^intervals  close  to  zero.  The  table  that  was  constructed, 
therefore,  uses  smaller  subintervals  close  to  zero  and  larger 
subintervals  further  out. 

Finally  we  used  DCADRE  again  to  evaluate  (III. D. 2. 10)  except 
near  the  singularities,  which  we  were  able  to  evaluate  analytically  and 
then  add  back.  The  technique  is  often  referred  to  as  "removing  the 
singularity" . 

c.  Removing  the  Singularities  Due  to  (III. D. 2. 3) 

We  now  describe  how  we  evaluated  the  integrals  in 

(III. D. 2. 10)  in  the  vicinity  of  the  singularities  in  (III. D. 2. 3)*  We 

1  /2 

see  that  the  density  of  A^  (a,1~a)  given  in  (III. D. 2. 3)  is  undefined  at 
a  =  0  and  a  =  1  for  a  <  1/2  and  at  a  =  1  for  a  Z  1/2.  We  also  note  from 
(III. D. 2. 3)  that  for  small  6  >  0  and  a  <  1/2 


fAi/2(a;ct)  :  r(o)rd-o) 

n 


,  0  <  a  <  6; 


(III.D.4.3) 


and  for  all  0  <  a  <  1 


f  (a;a) 
A 

n 


2a 


r(a)r(1-a) (1-a2)‘ 


1-5  <  a  <  1  .  (III. D. 4. 4) 


Therefore  for  a  <  1 /2  and  1  £  x/y  or  x/y  £  0  we  have  from  (III.D.4.3) 


a-6 


a=5  2a-1 

[  f  1/2(a;a)f  { ( x-ay ) ;  1  -a}da  ;  [  pTT'ffrfl'y  fe  ( (x-ay) ;  1  -a}da  . 

n  A  n  ;  n 

3-0  "  3-0  (III. D. 4. 5) 


Since  f  (•)  is  continuous  in  this  situation,  there  exists  a  number  c  so 


that  0  <  c  <  6  and  |x|  £  j  x— cy |  £  J  x— 6y  |  and 


a=6  _  2a-1 

r  2  a 


a=6  _  2a-1 

2a 


j  r(<,)r(l—  )fc  Kx-^Ji'-alda  ■  f£  Kx-cyhl-a)  j  f(WrRl) 

a-0  "  n  a-0 


da 


=  f£  { (x-cy ) ; 1 -a}  r( 2-a) r( 1 +a)  * 
n 


(III. D. 4.6) 


A  natural  approximation  for  c  allows  |x-cy|  to  be  the  average, 
(  1  12 ) | 2x-5y  |  . 


For  all  m  and  1  <  x/y  or  x/y  <  0  we  have  from  (III. D. 4. 41 


j  f  1/2(a;a)f  { ( x-ay ) ; 1 -a }da 

,  ,  A  n 

a=1-6  n 


a=1 


1 


2a 


a=1-6 


r( a) r( 1 -a)  ^ i _a2  ja 


f  { (x-ay) ; 1 -a}da  .  (III. D. 4. 7) 


Likewise  there  exists  a  new  number  c  so  that  1-6  <  c  <  1  and 
I x— y |  <  | x— cy |  <  |x-y+6y|  and 


a=1 


a-1-6 


1 


2a 


r(o)r(1-a)  ^_a2ja  eR 


f  { (x-ay ) ; 1 -a}da 


a=1  .  ? 

f  { (x-oy) ;  1  -a}  f  -=7—7-7-: r  ( - )da 

en  J  r(o)r(1-a)  ^ ( 1 -a2) a 

3  —  1  “*6 


N  ,  a( 26) ^  a 

fe  Hx"Cy);1"a}  r(  1  +a)  r( 2-a) 

n 


( III. D. 4.3) 


Again  a  natural  approximation  for  c  allows  [  x  —  c  y  j  to  be  the  average, 
(1/2) | 2(x-y)+6y  |  . 

d.  Removing  the  Singularity  Due  to  (III. B. 2. 3) 

The  final  type  of  singularity  occurs  when  0  <  a  =  x/y  <  1 

and  a  Z  1/2.  When  this  situation  occurs  we  leave  f  (•)  under  the 

e 

n 

integral  and  argue  that  in  a  6 -nei gh bor hoo d  around  x/y  <  1, 


f  .  .»(a;ot)  *  f  ,  ,„(x/y;a).  Note  that  by  the  same  argument  that  gave  us 


(III.  D.  4. 6)  and  (III. D. 4. 8),  there  exist  two  numbers  and  so  that 


(x/y)-6  £  c1  £  x/y  and  x/y  £  c2  £  (x/y)+6 


a=x/y 

|  f  1/2(a;a)f£  { (x-ay) ; 1 -ct}da 

J  N  .  A  n 

a=(x/y)-6  n 


a=( x/y ) +5 

+  f  f  1/2(a;ct)f  { (x-ay ) ;  1  -ct}da 

.  A  en 

a=x/y  n 


a=x/y 

=  f  ^(c^a)  f  f  { (x-ay) ;  1 -a}da 

n  a=(x/y)-6 


a=( x/y) +5 

+  f  1/2(c2:ot)  f  fe  ( ( x-ay ) ;  1  -a}da  .  (III.  D.  4. 9) 

A  *  .  n 

n  a=x/y 


We  chose  to  approximate  c1  and  c2  both  by  x/y  for  x/y  *  ±  1  ,  and  have 
f  _(x/y;a)  <  00  for  all  a.  If  x/y  =  1  or  x  =  0  and  y  =  0  simultane¬ 


ously,  the  value  of  (III. D. 2. 10)  is  undefined  for  a  >  1/2. 

Now  changing  the  variable  of  integration  so  that  (x-ay)  =  u, 
we  have  from  (III. D. 4.9)  that  for  all  a  >  1/2 


a=x/y 


a=( x/y ) +6 


I  f£  { (x-ay ) ; 1 -a}da  =  J  f£  ( ( x-ay ) ; a} da  , 

a=(x/y)-6  n  a=x/y  n 


f  (u;1-ct)du 
en 

*  Cj)  .  (III.D.H.10) 

because  f  (•)  is  a  symmetric  density.  That  is  (III. D. 4. 10)  is  an 
n 

expression  for  jyy  P ( 0  <  <  | y 6 |  )  where  is  the  ( 1 -a) -Laplace 

innovation  random  variable.  Therefore,  we  add  back  to  the  DCADRE  result 
the  amount 

(jyr)f  1/2(x/y;a){P(0  <  en  <  |yfi|  )}  £  f  1/2(x/y;a)  <  -  .  y  -  0. 

n  n  (III. D. 4. 11) 

We  choose  the  following  combination  as  the  value  for 
P(0  <  en  <  jySj  ). 

i)  Using  the  trapezoidal  rule  and  the  table  of  values  for 
the  ( 1 -a) -Laplace  density  we  found 

u=M 

p 1 ( 0  <  en  <  j y 6 1  )  =  1/2  -  f  f  (u;1-a)du  .  (III. D. 4. 12) 

u=| y  6 1  'n 

Equation  (III. D. 4. 12)  is  the  average  of  the  upper  and  lower  Riemann  sums 
of  the  tail  of  the  density  subtracted  from  1/2.  Using  (III.D.4.12) 
instead  of  directly  integrating  f  (u;1-a)  from  zero  to  |  y  6 1  is 


preferrable,  because  for  a  i  1/2,  f  (0;1-a)  is  undefined.  The  error  in 


( I I I . D . 4 . 1 2 )  from  using  the  trapezoidal  rule  approximation  is 
hi  J  th 

approximately  A2f  (i)  in  the  i  subinterval.  Even  though  there 

n  ' 

are  over  400  subintervals,  the  second  differences  A2f  (i)  are  very  much 

e 

smaller  for  a  £  1/2  in  the  interval  [|y6|,M], 

ii)  A  second  measure  of  P ( 0< e^< | y 6 1 )  is  the  lower  sum 

P2(°  <  en  <  l  y 6 1  )  -  j  y 5 1 f e  { (y 6) ; 1 -a} ,  (III. D. 4. 13) 

since  P(0  <  en  <  | y 6 1  )  is  always  at  least  as  large  as  (III.D.4. 1 3) .  Our 

approximation  for  P(0  <  <  | y 6 }  )  is  the  maximum  of  (III.D.4.  12)  and 

( III.D. 4. 1 3) .  We  use  the  maximum  because  P1  given  by  (III.D.4. 12)  could 

be  negative  when  | y 6 j  is  close  to  zero.  This  follows  because  F  (u;1-a) 

en 

is  strictly  decreasing  for  u  >  0,  and  thus  the  trapezoidal  rule  over¬ 
estimates  the  integral  in  ( III.D. 4. 1 2) . 

E.  PARAMETER  ESTIMATION  IN  THE  BELAR( 1 )  PROCESS 
1 .  Introduction 

In  this  section,  we  develop  estimators  for  the  parameters  in  the 
3ELAR(  1 )  process  and  report  results  on  properties  of  these  estimators 
obtained  from  analytical  comparisons  and  simulations.  We  examine 
estimators  for  the  location  parameter,  y ,  and  the  scale  paramter,  A, 

of  the  series  (X^l;  the  parameter,  a,  of  the  random  coefficient 

1  /2 

(ct,1-a);  and  Y ,  the  lag-1  serial  correlation,  which  is  a  monotone 
f unct ion  of  a. 
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The  theory  of  conditional  least  squares  estimation  for  the 

BELAR( 1 )  process  using  the  linearized  residual  is  derived  using  results 

from  Nicholls  and  Quinn  [Ref.  16].  We  give  a  corollary  to  their  Theorem 

3.1  pertaining  to  the  strong  convergence  and  asymptotic  Normality  of  the 

least  squares  estimator  of  Y,  the  lag-1  serial  correlation.  An  estimate 

1  /2 

for  a  is  derived  using  the  fact  that  Y  =  E{An  (a,1-a)}.  Also,  we  show 
that  the  joint  least  squares  estimator  of  location  and  correlation  for 
the  BELAR(I)  process  is  the  same  as  for  the  linear  AR(1)  processes. 

Other  estimators  of  lag-1  serial  correlation  in  the  BELAR(I) 
process  are  derived  using  the  ideas  of  robust  estimation  of  Huber 
[Ref.  37]  and  least  absolute  deviation  (LAD)  estimation  as  applied  to 
ordinary  linear  autoregressive  models  by  Denby  and  Martin  [Ref.  38]  and 
Bloomfield  and  Steiger  [Ref.  39].  Although  these  estimators  are 
consistent  and  asymptotically  unbiased  in  linear  models,  for  the  random 
coefficient  models  the  results  of  the  simulation  study  show  that  they 
have  a  bias  that  does  not  go  to  zero  asymptotically. 

The  maximum  likelihood  estimator  of  a,  is  found  using  an 

iterative  technique  with  the  initial  estimate  being  derived  from  the 
least  squares  estimate  of  serial  correlation,  Y  c . 

Many  of  the  simulations  comparing  the  different  estimators  are 
conducted  within  the  framework  of  SIMTBED  [Ref.  15].  From  the  Summary 
Statistics  table  generated  by  SIMTBED  for  each  estimator,  it  is  possible 
to  draw  conclusions  concerning  the  bias,  the  variance  at  different 
subsample  sizes,  the  asymptotic  variance,  and  how  fast  the  estimator 
approaches  asymptotic  Normality.  In  the  SIMTBED  program,  one  can 
specify  the  total  number  of  samples  examined  at  each  subsample  size. 
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The  total  number  of  samples  used  is  the  product  of  three  parameters,  N, 
M,  and  NSR.  Three  combinations  of  these  parameters  were  used.  Table 
III. E. 1.1  is  a  summary  of  the  number  and  types  of  subsample  sizes,  N, 
and  the  number  of  independent  repetitions,  M,  of  each  type  of  simulation 
conducted  using  SIMTBED. 

TABLE  III. E. 1.1 
Summary  of  SIMTBED  Types 


Number  of 


Subsample 

Sizes 

(N) 

Super 

Replicati 

Type 

25 

50 

75 

100 

125 

175 

250 

500 

(NSR) 

I 

2000 

1000 

660 

500 

400 

280 

200 

100 

5 

II 

4000 

2000 

1330 

1  000 

800 

570 

400 

200 

10 

III 

8000 

4000 

2660 

2000 

1600 

1 1  40 

800 

400 

1  0 

Each  entry  in  a  Summary  Statistics  table,  which  is  the  output  of 
SIMTBED  after  super  replication,  is  a  pair  corresponding  to  a  mean 
(average  over  the  number  of  super  replications,  5  or  10)  and  an 
estimated  standard  deviation  of  that  mean  value.  From  Table  III. E. 1.1, 
it  is  clear  that  a  large  number  of  independent  realizations  was  used  in 
the  computation  for  each  super  replication  and  the  different  subsample 
sizes.  Because  of  this,  subsequent  tests  of  hypothesis  that  we  use  on 
the  simulation  outputs  will  be  t-tests  on  the  mean  of  a  random  sample  of 
size  5  or  10  drawn  from  a  Normal  population  where  o2  is  unknown,  but  is 
estimated  from  the  sample. 

Before  describing  each  estimator  and  simulation  experiment,  it 


is  convenient  now  to  summarize  the  conclusions  of  this  investigation 
into  the  estimation  of  parameters  in  the  BELAR(I)  process: 


The  simulation  results  from  SIMTBED  indicate  that  both  the  sample 


median  and  sample  mean  are  asymptotically  Normal  estimators  of  p. 
The  asymptotic  variance  of  the  sample  mean  is  approximately  twice 
that  of  the  sample  median  across  all  values  of  the  correlation 
coefficient,  Y. 

The  simulation  results  from  SIMTBED  also  indicate  that  the  mean 
absolute  deviation,  given  in  (II.E.3.2),  is  an  unbiased  and 
asymptoti  .lly  Normal  estimator  of  the  scale  parameter,  It 
also  has  the  smallest  asymptotic  variance  of  the  three  estimators 
considered . 

The  least  squares  estimator  of  Y,  the  lag-1  serial  correlation  is 
asymptotically  unbiased  and  Normally  distributed.  Simulation 
results  support  this  conclusion. 

Simulation  of  other  estimators  of  lag-1  serial  correlation  based 

on  non-linear  residuals  of  the  form  R  *  X  -YX  ,  +  Bf(X  ,X  ) 

n  n  n- 1  n  n- 1 

indicates  that  the  value  of  (Y,B)  that  maximizes  the  sum  of 
squares  of  Rn  is  approximately  (Y^,0). 

Robust  estimators  of  serial  correlation  based  on  certain  symmetric 
loss  functions  of  the  linear  residual  (other  than  the  sum  of 
squares)  are  biased  and,  apparently,  asymptotically  biased. 
SIMTBED  outputs  of  the  Huber(c),  rank  and  LAD  estimators  of  lag-1 
serial  correlation  clearly  exhibited  this  result. 

The  maximum  likelihood  estimator  of  Y,  the  lag-1  serial 
correlation  was  computed  by  the  iteration  scheme  given  in  Section 
III.E.6  for  simulated  data  from  the  BELAR(I)  process.  Results  of 
the  simulation  appear  to  indicate  that  the  estimator  is  converging 
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to  a  Normal  distribution  with  a  mean  value  equal  to  the  true  Y. 
In  comparison  to  the  least  squares  estimator,  the  simulation 
results  indicate  that  the  maximum  likelihood  estimator  has  a 
smaller  variance  and  bias  at  all  values  of  Y. 

2.  Estimators  of  Location 
a.  Introduction 

The  sample  median,  m,  and  the  sample  mean,  X,  are  two 
commonly  used  estimators  of  the  location  parameter,  u,  in  a  stationary 
process  with  a  symmetric  marginal  distribution.  The  sample  median  is  a 
particularly  attractive  alernative  to  X  when  the  symmetric  distribution 
is  also  thick-tailed.  (It  is  well  known  that  for  i.i.d.  processes  with 
a  double  exponential  marginal  distribution  that  the  sample  medi  n  is  the 
maximum  likelihood  estimator  of  u) . 

For  i.i.d.  processes,  it  is  well  known  (Dudewiez,  [Ref.  ^0: 

p.  221])  that  X  has  an  asymptotically  Normal  distribution,  N(0,/oy/n). 

Likewise,  m  is  asymptotically  Normal,  N { 0 ,  A  Anf  * (x  ) } .  The  results 

for  the  sample  median  hold  provided  fv(x  c)  is  continuous  in  a 

x  .  b 

neighborhood  around  x  c,  is  positive,  and  is  bounded  above. 

.  b 

The  problem  of  estimating  u  from  dependent  data  is  more 
difficult.  Analytical  results  exist  about  the  limiting  distribution  for 
X  in  ergodic  processes  and  for  the  sample  median  for  processes 
satisfying  certain  mixing  conditions.  (Mixing  processes  are  those  for 
which  random  variables  "sufficiently  far  apart"  are  approximately 


independent )  . 


Since  the  BELAR(I)  process  is  an  RCA(1)  process  with  i.i.d. 
innovation  and  random  coefficient  processes,  {X^},  is  ergodic  (Nicholls 
and  Quinn  [Ref.  16:  p.  371).  Therefore  X  is  still  an  unbiased 
asymptotically  Normal  estimator  of  p,  but  the  variance  is  modified  oy 
the  factor 

00 

1+2  l  Yk  =  (1+Y)/(1-Y).  (III. E. 2.1) 

k=1 

See,  for  example,  Priestly  [Ref.  33:  p.  3^31* 

The  problem  of  estimating  the  median  has  been  studied  for 
cases  where  the  data  are  dependent.  From  Heidelberger  and  Lewis 
[Ref.  41)],  we  have  that  the  usual  order  statistic  point  estimate 
(sample  median)  is  still  valid,  but  the  variance  is  modified  by  a 

factor,  p(x  ).  Here  p(x  )  is  the  initial  point  on  the  spectrum  of  the 

•  j  •  y 

binary  process  {I  (x  _)},  where 
n  .  o 

1  if  X  <  x, 

I  (x)  =  '  n  (III. E. 2. 2) 

1  0  otherwise. 

That  is 

n 

p ( x  _)  =  lim  n  Var  {  y  I.(x  _)/n}.  (III. E. 2. 3) 

.o  i  =  i 

As  was  already  pointed  out,  conditions  for  convergence  and 
Central  Limit  Theorems  for  the  sample  median  depend  on  mixing 
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conditions.  There  are  several  kinds  of  mixing  conditions.  It  is  not 
known,  however,  if  the  BELAR(I)  process  satisfies  any  of  them. 

However,  the  L A R ( 1 )  process  does  satisfy  the  mixing 
conditions  of  Gastwirth  and  Rubin  [Ref.  14],  Thus,  for  the  LAR(1) 
process,  it  is  known  that  the  sample  median  has  an  asymptotic  Normal 
distribution  with  mean  zero,  and  variance  given  by 

l  {Y^cosh(x  +  sinh(x  (III.  E.  2. 4) 

Gastwirth  and  Rubin  [Ref.  14]  showed  that  for  the  LAR(  1 ) 
process,  the  asymptotic  variance  of  X  is  twice  that  of  the  sample  median 
across  all  values  of  serial  correlation. 

The  question  here  is,  what  are  the  properties  of  the  sample 
median  in  estimating  y  from  data  of  the  BELAR( 1 )  process?  Also,  how  does 
the  sample  median  compare  to  X  in  the  BELAR(I)  process? 

Since  {X  }  from  both  the  BELAR(I)  and  the  LAR(1)  processes 
n 

have  a  marginal  Laplace  distribution  and  first-order  autor egr essi ve 
correlation  structure,  the  hypothesis  is  that  the  sample  median  from  the 
BELAR( 1 )  process  behaves  similarily  to  that  generated  from  data  in  the 
LAR(  1  )  process.  Also,  the  relative  efficiency  of  m  to  X  is  the  sam°  in 
the  two  processes. 

To  substantiate  this  assumption,  the  sample  median  and 
sample  mean  were  compared  in  simulation  experiments  in  SIMTBED  for  data 
generated  from  the  BELAR(l)  process.  The  simulation  output  is  compared 
to  the  theoretical  results  for  the  LAR(l)  process. 


b.  Simulation  Results 

For  a  =  .1  and  a  corresponding  correlation  coefficient  of 
Y  -  .  17664,  the  estimators  X  and  m  were  simulated  in  SIMTBED  using  a 
size  of  Type  III  from  Table  III.  E.  1.1.  The  results  are  given  in  the 
Summary  Statistics  in  Table  III. E. 2.1.  Looking  at  Table  III. E. 2.1  for 
N  =  100  and  greater,  there  is  no  evidence  of  ncn-Normal ity  from  the 
first  four  estimated  moments  of  the  sample  mean.  The  leading 
coefficient  in  the  asymptotic  expansions  for  E(X)  and  Var(X)  do  not 
deviate  significantly  from  the  theoretical  values,  i.e.  X  is  unbiased 
and  Var (X)  =  2.8581  /N. 

Looking  at  Table  III. E. 2. 2,  the  Summary  Statistics  at  a  =  .  1 
for  m,  it  appears  that  even  for  N  =  25,  m  is  unbiased  and  the  sample 
skewness  is  fluctuating  about  zero.  The  variance,  however,  at  each 
subsample  size  up  to  N  =  250  deviates  significantly  from  a  hypothetical 
asymptotic  variance  of  1  .4291  /N,  the  corresponding  result  for  LAR(  1 )  . 
This  is  explained  by  the  kurtosis  of  the  estimate  m  of  the  median  which, 
although  decreasing  with  increased  subsample  size,  is  still 
significantly  different  from  0  until  N  =  250.  The  leading  coefficients 
in  the  expansions  for  the  expectation  and  for  the  variance  are  not 
significantly  different  from  0  and  1.4291  respectively.  Since  the  data 
are  only  slightly  correlated,  we  could  have  expected  the  sample  median 
to  behave  similarily  to  that  of  the  case  of  the  completely  random 
process  with  Laplace  marginals,  i.e.  m  is  unbiased,  asymptotically 
Normal ,  and  has  a  variance  with  leading  coefficient  1/n. 
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REGRESS  I  OB  ON  VARIANCE  -  COEEflCIENTS:  H8IU  'WJKI 


For  values  of  a  »  .5  arid  .844,  with  corresponding  Y  =  .63662 
and  .89986,  using  Type  II  experiments  as  described  in  Table  III. E. 1.1, 
we  again  compared  the  behavior  of  X  and  m. 

From  Tables  III. E. 2. 3  and  III. E. 2. 4,  we  see  that  the 
behavior  of  X  is  as  expected.  The  sample  mean  appears  to  be  unbiased. 
For  N  £  250,  there  is  no  evidence  of  non-Normal  i ty .  The  estimates  of 
the  leading  coefficient  in  the  asymptotic  expansions  for  the  variance 
agree  within  one  standard  deviation  of  the  postulated  values  of  9.0  and 
38. 

The  corresponding  results  for  m  are  given  in  Tables 
III. E. 2. 5  and  III. E. 2. 6.  The  sample  median  shows  no  bias  and  appears  to 
be  asymptotically  Normal  after  N  £  250.  In  each  case  (a  =  .5  and 
a  =  .844)  the  leading  coefficient  in  the  expansion  for  the  variance  is 
smaller  than  the  corresponding  value  for  the  variance  of  the  sample 
median  in  the  LAR( 1 )  process,  i.e.  4.5  and  19  respectively. 

The  analysis  thus  far  has  indicated  that  at  least  for  data 
with  non-negative  correlation  in  the  BELAR(I)  process,  there  is  little 
evidence  to  suggest  that  the  behavior  of  the  sample  median  is 
significantly  different  than  in  the  LA R (  1  )  process.  From  Table 
III. E. 2. 7,  we  see  the  same  kind  of  results  that  Gastwirth  and  Rubin 
[Ref.  14]  reported.  As  sample  size  increases,  the  efficiency  of  X 
relative  to  m  drops  to  50$. 
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TABLE  III. E. 2. 7 


Efficiency  of  X  Relative  to  m  in  BELAR(l)  for  Y  >  0 


N 

Y  =  +. 1 766 1 

Y  =  +.63662 

Y  =  + 

25 

.64 

.69 

.98 

50 

.58 

.58 

.81 

75 

.55 

.55 

.73 

1  00 

.54 

.52 

.67 

1  25 

.52 

.50 

.62 

175 

.53 

.49 

.57 

250 

.51 

.47 

.53 

500 

.50 

.47 

.48 

1.  For  Y  -  +.1766  the  results  are  based  on  a  Type  III  experiment. 
For  the  other  two  cases,  the  results  are  based  on  Type  II 
experiments . 


We  also  simulated  X  and  m  for  negatively  correlated  data 
from  the  BELAR( 1 )  process.  Type  III  simulations  were  used  for  X  and  m 
at  Y  *  -.63662  and  a  Type  II  simulation  for  X  at  Y  =  -.9.  From  the 
Summary  Statistics  for  X  in  Tables  III. E. 2.8  and  III. E. 2. 9,  we  see  X  is 
unbiased  and  approximately  Normal  for  sample  sizes  greater  than  125. 
Estimates  for  the  coefficients  for  the  asymptotic  variance  are  not 
significantly  different  from  the  theoretical  values  of  .4441  and  .1053. 

From  Table  III. E. 2. 10,  the  most  obvious  point  to  be  made  is  that 
even  for  moderately  negatively  correlated  data,  m  is  not  Normally 
distributed  even  for  subsamples  of  size  500.  The  sample  median  is 
unbiased,  but  the  kurtosis  is  not  decreasing  fast  enough.  The  variance 
of  the  sample  median  even  at  N  =  500  is  almost  certainly  not 
( 1 /N )  ( 1  +Y/1  - Y)  .  However,  the  leading  coefficient  in  the  expansion  for 


the  asymptotic  variance  is  within  a  standard  deviation  of  the 
hypothetical  values  ( 1  /N)  ( 1  +Y/1  -Y) .  This  would  indicate,  for  the  case 
of  negative  correlation,  a  much  slower  convergence  of  the  sample  median 
to  Normality  than  for  positively  correlated  data. 

For  negatively  correlated  data  from  the  BELAR( 1 )  process,  we 
have  observed  that  X  does  not  lose  efficiency  relative  to  m  as  fast  as 
for  non- negatively  correlated  data.  In  fact,  from  Tables  III. E.  2.8  and 
III. E. 2. 10,  it  is  clear  that  the  variance  of  X  is  smaller  than  m  for 
subsample  size  N  S  100. 

3.  Estimators  of  Scale 
a.  Introduction 

In  the  case  of  estimating  the  scale  parameter,  X,  we 
considered  three  estimators.  Since  VarCX^)  =  2X2,  we  considered 


A1  =  S//  2  where 


s‘  -  i  I  (x  -  x>!. 

1=1 


(III. E. 3.1) 


Since  the  maximum  likelihood  estimator  of  X  for  an  i.i.d.  sample  with 
marginal  Laplace  distribution  is  the  sample  mean  absolute  deviation 
about  the  median,  we  set 


x2  *  ii  A  ixi "  mi- 


(III. E. 3. 2) 


.\.r  .**  fc' 


SIMTBED  Summary  Statistics  for  Estimating  y  by  X  in  the 


REGRESSION  OR  VARIANCE  -  COEFFICIENTS: 


<11 

mm 

m*» 

ev 

M»- 

M 

•-Pi 

Ml 

-M 

r*M 

r^l 

run 

•UN 

rUM 

N 

o*- 

mm 

OO 

OO 

oo 

o 

oo 

o 

OO 

oo 

OO 

OO 

oo 

OO 

OO 

oo 

OO 

O 

yy 

WW 

l-M.I 

M 

blii 

Ui 

l|M.I 

yy 

yy 

Ui 

O0V 

MN 

MO 

oo 

inn 

OO 

mm 

*n 

p-<*4 

90 

MW- 

r-O 

JTN 

MIO 

m 

0M 

too 

mm 

PM- 

on- 

rdl 

OO 

Mr- 

r^- 

MOV 

or- 

ON 

r- 

mo 

tom 

<00 

am- 

OO 

PM- 

run 

mm 

— o 

PA 

OO 

Or- 

(-« 

r*P i 

•rl 

M 

•**» 

mm 

MM 

mm 

*-o\ 

am 

4O0C 

sot*- 

MO 

400 

MM 

oo 

•*- 

0i^ 

•UN 

e 

i 

OO 

oo 

1 

oo 

o° 

80 

oo 

1 

8° 

1 

1 

OO 

OO 

OO 

OO 

oo 

M 

o 

01 

1 

>1  (N 

mn 

ON 

mm 

•-M 

NN 

N 

Pi 

run 

— M 

MM 

trasn 

run 

•-Pi 

M 

N 

NN 

OO 

OO 

OO 

oo 

o 

OO 

O 

o 

oo 

OO 

OO 

OO 

OO 

oo 

oo 

O 

O 

Mr- 

F—« 

iQ 

M 

1 

«  1 

1  l 

1  1 

1  1 

1  1 

t  1 

1 

NN 

V0T- 

ro 

X 

MW 

Ui 

Ui 

Ui 

l.M.J 

yy 

lllll 

WUi 

Ui 

Ui 

Nm 

OM 

p-m 

P-M 

MM 

e*m 

PM- 

r*m 

r-^ 

MM 

Or- 

PMH 

rUO 

«nm 

0*40 

vO 

Of*- 

M«— 

run 

MO\ 

MO 

0M 

run 

MO 

PIO 

9W 

PA 

P4 

<7*0 

OM 

Iff- 

'ON 

-MO 

run 

40r- 

Jt- 

OM 

MP4 

Pf- 

mas 

run 

PA 

oo 

oo 

M 

nr 

a^ 

i 

c 

1 

It 

w 

OO 

OO 

OO 

OO 

OO 

do 

oo 

OO 

OO 

do 

OO 

OO 

oo 

do 

oo 

-p 

5- 

w 

* 

1 

* 

1 

rfl 

o 

g 

MM 

run 

PM- 

P4N 

P* 

run 

run 

MM 

run 

^M 

M 

M 

Pi 

H 

•rl 

c 

OO 

oo 

o 

OO 

O 

o 

O 

OO 

l  t 

OO 

OO 

oo 

OO 

1  1 

o 

O 

o 

1 

MM 

oo 

pm 

oo 

4J 

03 

Ui 

Ui 

Ui 

Ui 

UJUi 

upy 

UI 

Ui 

Ui 

i  i 

me* 

oo 

OOi 

mm 

901 

MN 

P40I 

MIO 

Mm 

MO 

PA 

UAa 

O 

PM- 

r-M 

OM 

ON 

MM 

mm 

P*M 

P*M 

Mur 

*-m 

MO. 

Pi 

•rp 

W 

in 

o 

r— o 

<7*r\ 

CNM 

som 

mm 

*“© 

mm 

nm 

MO 

PMD 

rw 

—ON 

n» 

V040 

oo 

w 

MM 

mm 

M 

mm 

r0 

MO 

Ml 

•UN 

r-p 

u 

II 

o 

OO 

oo 

do 

oo 

OO 

OO 

OO 

OO 

OO 

oo 

oo 

OO 

oo 

Or 

oc 

H 

o 

a 

i 

1 

M 

4-1 

do 

dc 

M 

-C 

run 

MN 

p« 

«-M 

run 

run 

M 

Pi 

W 

■p 

w 

OO 

oo 

OO 

o 

OO 

Ca 

o 

o 

OO 

OO 

OO 

oo 

oo 

O 

o 

o 

w 

o 

■H 

yy 

yy 

Ui 

yy 

Ui 

Ui 

yy 

l-1^. 

yy 

w 

Ui 

Ui 

n 

5 

MM 

0-“ 

MO 

mas 

«-o 

OO 

MM 

p*m 

P0 

P*0 

Mr- 

Or- 

OM 

•UN 

irm 

nut 

M 

m 

iff- 

MM 

om 

o» 

P-m 

«-N 

run 

O— 

rrr 

run 

■p 

o 

r- 

OO 

0m 

PM- 

oov 

OM 

0*n 

OO 

M^ 

40M 

0®l 

r— 0 

•V 

f-O 

MM 

NM 

0m 

« 

(/} 

01 

MJT 

r-e* 

P4N 

«xr 

MP* 

MM 

rUJV 

©•n 

run 

PA 

0m 

r*«B 

rfl 

" 

H 

01 

to 

OO 

OO 

OO 

oo 

OO 

do 

oo 

OO 

oo 

OO 

OO 

OO 

OO 

OO 

P 

<D 

' 

• 

t 

i 

1 

1 

• 

• 

1 

<TJ 

U 

< 

P 

o 

H 

M 

MM 

e*m 

_ 

NN 

Pi 

M 

M 

M, 

MM 

M 

M 

N 

Pi 

o 

U 

ftft 

oo 

oo 

o 

OO 

o 

o 

o 

OO 

OO 

oo 

o 

o 

o 

O 

1^ 

cu 

1  1 

1  1 

i 

1  1 

1  1 

1 

« 

u- 

liM 

yy 

Ui 

Ui 

Ui 

Ui 

UiW 

Ui 

Ui 

Ui 

Ui 

Ui 

>1 

NM 

OCX 

MO 

OO 

mm 

M0 

Nr 

OM 

xun 

O 

i- 

1 

\  o 

MM 

m*- 

OP- 

400 

oo 

NO 

*-a 

oo 

r-o 

PV 

P*Oc 

MNO 

Pro 

0VM 

O 

X 

3 

M\ 

M 

90 

am 

MPi 

OM 

040 

OlO 

MO-N 

P*0> 

ON 

m»« 

ui 

03 

rH 

MM 

00 

or* 

r*m> 

NN 

PM- 

run 

— ' CP 

mo 

MM 

KM 

run 

r0 

or- 

PN» 

1 

¥5 

* 

U 

p 

a 

OO 

oo 

oo 

oo 

OO 

oo 

OO 

do 

do 

oo 

OO 

OO 

OO 

OO 

OO 

OO 

Ui 

u- 

0} 

< 

pi 

0 

< 

X 

u- 

Ui 

O 

MM 

M 

Pi 

Pi 

Pi 

mm 

m 

Pi 

N 

Pi 

Pi 

Ui 

u 

Q 

CQ 

OO 

o 

1 

oo 

•  1 

o 

1 

oo 

1  I 

o 

1 

o 

1 

O 

O 

i 

OO 

1  1 

OO 

1  i 

OO 

1  1 

o 

» 

o 

o 

1 

o 

1 

> 

< 

1 

W 

WUi 

<MT 

P-M 

M 

400 

Ui 

p*l 

Ui 

run 

Ui 

rov 

Ui 

Uhl 

fuOl 

PP0 

NM 

Ui 

PA 

X 

X 

CQ 

0*0 

ON 

oio 

PNM 

mm 

o® 

r-o 

4UM 

o«n 

mm 

MUD 

ma\ 

asm 

rufi 

OM 

o 

o 

E-* 

(N 

MN 

Mm 

or\ 

ru— 

<v— 

r*oi 

no 

400 

or- 

ON 

4 OM 

O0I 

M0* 

*-o 

«-*o 

PM 

(Nr 

mm 

o 

M 

oo 

oo 

o 

oo 

do 

do 

do 

do 

do 

oo 

OO 

do 

OO 

Ui 

00 

V) 

0 

Ui 

yj 

X 

o 

X 

Ui 

Ui 

¥» 

X 

o 

-J 

M 

M 

Ui 

ft. 

V) 

X 

w 

> 

z 

M 

M 

o 

o 

Ui 

< 

X 

o 

u 

n 

o 

o 

O 

o 

o 

o 

a 

VXU 

X 

2 

X  — 

n 

in 

o 

M 

M 

o 

M 

r- 

0V 

OOTM 

< 

Ui 

X 

x’ 

<  o 

o 

o 

PI 

Ml 

r— 

01 

0v 

0i 

< 

3— 

y 

f- 

X 

o 

Ui 

a 

MM 

X 

¥» 

1/5 

X 

CO 

Or  o 

o 

o 

o 

o 

o 

o 

o 

o 

z 

w» 

REGRESSION  ON  VARIANCE  -  COEE F ICI ENTS:  8l i^SSS^E 


As  a  third  alternative,  we  chose  the  scaled  median  absolute  deviation 
about  the  median, 


~  |X.  ~  m 

med[  1  1 

3  =  i  (  .6931  5 


(III.E.3.3) 


The  scaled  median  absolute  deviation  is  a  frequently  used  robust 
estimator  of  scale  [Ref.  38].  In  the  simulations,  we  assumed  that 
are  Laplace  with  median  =  mean  =  0  for  all  n.  Table  III. E. 3.1  contains 
a  summary  of  the  type  simulation  (as  defined  in  Table  III.  E.  1.1),  the 

AAA 

estimator  U1 ,  X2>  X^)  and  the  values  of  a  and  Y  that  were  used. 

TABLE  III. E. 3.1 

Summary  of  Simulation  Schedule  for  Estimators  of  X 


Y 

-.89986 

.17664 

.63662 

a 

.844 

.1 

.5 

Estimator 

*1 

Type  II 

Type  III 

Type  I 

*2 

Type  II 

Type  III 

Type  I 

*3 

Type  II 

Type  III 

Type  I 

b.  Simulation  Results 

In  the  Type  III  simulation  (See  Tables  III. E. 3-2  - 
III. E. 3.1*),  using  slightly  correlated  (Y  =  .17664)  realizations  of  the 
BELAR(I)  process,  we  found  the  best  estimator  of  X  to  be  the  sample 
mean  absolute  deviation.  It  appears  to  be  unbiased  for  all  subsample 


sizes.  The  skewness  and  kurtosis  are  decreasing  with  increased  sample 


<u 

+J 


u 


w 

J 

§ 


<  «-< 

•^r 

>1  & 
A  'sD 
r- 

<<  rH 

g'i- 

•H 

•P  TJ 
<0  C 

e  <o 

•pH 

4J  rH 
0)  • 
U  I! 

O  JS 

U-I  4-> 

■H 

V)  S 

o 

•h  to 

4J  co 
co  a) 

-H  O 


0 

U 

04 


>1  r~f 

U 

rtj  a: 

i  3 

3  :j 
m  cq 

c 

w 

m 

e- 

x 


regression  on  variance  -  coee f ic i ENis:  o ’ i S8?39  18:818?  '?8:3?8J 


REGRESSION  ON  VARIANCE  -  COEE  f  ICI  ENTS:  'HlfJJ  SifMIi 


sizes.  But  even  for  N  »  500,  the  skewness  is  still  significantly 


different  than  0.  Using  two-sided  t-tests  with  18  degrees  of  freedom 
for  the  equality  of  means  of  two  Normal  populations  with  unknown 
variances  at  the  90$  confidence  level,  we  reject  each  of  the  hypotheses 

A  /S  A  A 

independently  that:  (1)  Var(X^)  =  Var(A2)  and  (2)  VarCA^)  =  Var(A^). 

A  A  A 

The  mean  relative  asymptotic  efficiency  of  X^  and  A^  to  A^  are  estimated 
from  the  regression  on  variance  coefficients  to  be  76$  for  A1  and  60$ 
for  A  y 

Ak  A 

Both  A  and  A^  appear  from  the  simulation  to  be  biased. 
From  the  second  coefficient  in  the  mean  of  regression  on  average  in 

a 

Table  III.E.3«2,  A^  appears  to  have  a  negative  bias  of  order(1/N).  From 

a 

Table  III.E.3.1*  it  appears  that  A^  has  a  positive  bias  of  order (1/N). 
However,  since  the  leading  term  in  the  expansion  of  the  mean  for  both 
estimators  is  the  true  value  of  Y,  it  appears  that  both  X^  and  A ^  are 
asymptotically  unbiased. 

When  we  considered  moderately  to  highly  correlated  data  (see 
Tables  III. E. 3.5  -  III. E. 3-10),  the  differences  in  the  behavior  of  the 
estimators  were  not  as  easy  to  discern.  The  particular  bias  of  A^  and 
A ^  was  even  more  apparent,  especially  at  the  smaller  subsample  sizes. 
As  | y|  increased,  so  did  the  expressions  for  the  asymptotic  variances. 
At  each  of  the  subsample  sizes,  in  both  types  of  correlation,  A^  had  the 
highest  estimated  variance.  The  variance  of  A^  was  significantly 
different  than  that  of  A^  at  all  levels  of  significance  and  subsample 
sizes  up  to  N  =  500.  However,  we  could  not  reject  that  the  asymptotic 
variances  of  A^,  A 2  and  A^  were  the  same  at  each  of  the  two  levels  of 
correlation. 
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SIMTBED  Summary  Statistics  for  Estimating  X  by  X2  in  the 
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ESTIMATOR:  SAMPLE  MEDIAN  ABS  DEV;  LMDA-1.0 
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4.  Least  Squares  Estimation  of  Serial  Correlation 

In  tils  section,  it  is  assumed,  unless  otherwise  stated,  that 
has  a  standard  Laplace  (p  =  0,X  =  1)  distribution.  If  not,  standardize 

Xn  by 


X* 


n 


(III. E. 4.1) 


where  p  and  X  will  be  specified  from  those  estimators  already  discussed 
in  III.E.2  and  IiI.E.3. 

The  least  squares  estimator  of  the  lag-1  serial  corrlation,  Y,  g , 
is  derived.  First,  we  show  that  the  BELAR(I)  process  is  an  R CA ( 1  ) 
process  of  Nicholls  and  Quinn  [Ref.  1 6 ] .  Then,  we  define  the  linearized 
resioual  in  the  BELAR(I)  process  and  state  some  of  its  properties.  From 
these  properties  and  some  results  from  Nicholls  and  Quinn  for  RCA 
processes,  we  derive  the  asymptotic  properties  of  Y  „ .  The  properties 
of  Y^g  are  ob  erved  also  in  the  simulation  results  for  selected  values 
of  Y.  Finally,  the  joint  least  squares  estimator  of  location  and  serial 
correlation  are  derived  por  the  BELAR( 1 )  process. 

Rewriting  (III. D. 1.1)  by  adding  and  subtracting  YX  ,  we  have 

xn  =  XXn_1  ♦  {AjIi/2(a,1-a)  -  Y}Xn-1  +  en,  (III. E. 4. 2) 

where  Y  =  E{A^/2(a,1-a)}  as  given  by  (III. C. 2.  3)  for 

1 12 

l  =  'i(An  (a,1-a)  -  Y}  is  an  i.i.d.  process  stochastically  independent 


of  the  i.i.d.  {c  }.  The  variance  of  the  random  coefficient  is  (a  -  Y2) 


rv; 


for  all  n.  As  can  be  seen  from  (III.C.2.5)  and  the  fact  that  0  <  a  <  1, 

if  we  know  a,  then  we  also  know  |y|  and  vice-versa.  That  is,  in  the 

BELARC 1 )  process,  there  is  only  one  independent  parameter  to  estimate 

for  the  correlation.  Now,  we  recognize  (I1I.E.4.2)  immediately  as  an 

R  CA  (  1  )  process  of  Nicholls  and  Quinn  [Ref.  16].  Since  {e^}  and 
1  /2 

{A  (a,1-o)  -  Y}  are  each  identically  distributed  as  well  as  being 
n 

serially  independent  and  independent  of  each  other,  we  have  by  theorem 
2.7  [Ref.  16]  that  (Xn>  13  the  unique  strictly  stationary  and  ergodlo 
solution  to  (III. E. 4. 2). 

There  arc  two  ways  to  look  at  the  linearized  residual  in  the 
BELAR( 1 )  process  just  as  described  in  Chapter  II  for  the  NLAR(1)  model: 

Rn  =  {  A^/2(a,  1-a)  -  YjX^  ♦  en,  (III. E. 4. 3) 

or 

R  -  X  -  YX  . .  (III. E. 4. 4) 

n  n  n-1 

From  (III.E.4.4),  we  see  that  since  {X^}  is  strictly  stationary,  so  is 

{R  }.  Also,  we  see  E(R  )  =  0  and  Var(R  )  =  2(1-Y2).  Lawrance  and  Lewis 
n  n  n 

[Ref.  22]  proved  that  the  R^  are  uncorrelated ,  but  in  general,  not 

independent.  From  (III. E. 4. 3),  we  note  that  for  any  n,  R^*  en  unless 

a  =*  0 .  Except  for  when  a  =  0  or  1,  VartR^)  >  Var(e  ).  As  o  increases 

from  zero  to  one,  both  Var(R  )  and  Var(e  )  decrease  monotoni  cally  from 

n  n 

two  to  zero.  This  is  evident  from  the  definition  of  Y  in  (III.C.2.5) 
with  i  =  1  . 
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Two  other  properties  of  { R^ }  are  obtained  from  (III. E.  4. 3)  by 

conditioning  on  the  independent,  identically  distributed  processes  (ek) 
1  /2 

and  {Ak  (a, 1-a)  -  Y }  up  to  time  k  =  n  -  1.  We  have 

1  /2 

E[Rn|{ek,Ak  (a, 1-a)  -  Y} ;  k  =  1,2 . n-1] 

1  /2 

=  xn_1E{An  (a  1-a)  -  Y}  +  E(e  )  =  0,  (III. E. 4.5) 

1  /2 

because  { A^  (a, 1-a)  -  Y}  and  are  independent  of  the  process  through 
time  n-1  and  X  .j  is  a  function  only  of  the  process  through  n-1. 

E[R*|Uk»  A  )<  -  Y};  k  =  1,2 . n-1] 

-  E( e*)  +  x*_1E[{A^/2(a,1-a)  -  Y}2] 

=  2(1-a)  +  x2_ i (a-Y2 ) ,  (III. E. 4. 6) 

which  is  only  a  function  of  a  or  Y2  alone,  since  a  determines  Y2  and 
vice-versa . 

Now  using  (III. E. 4. 4)  and  a  given  realization  of  {X^}  of  size  n, 
n 

we  minimize  i  R?  with  respect  to  Y  to  obtain  the  conditional  least 
i-2  1 

squares  estimate  for  Y.  This  is  the  same  procedure  as  described  for  the 
NLAR(1)  process.  We  have 

-  n  n 

Y, „  =  (  l  x,x,  .  )  /  (  y  x?  .  1  .  (III. E. 4. 7) 


Two  problems  can  occur  using  (III.  E.  4. 7),  especially  for  small 
sample  _izes.  For  the  BELA R ( 1 )  process  defined  by  (III. E. 4. 2), 
1  £  Y  £  0,  and  yet  it  is  possible  that  Y^s  <  0  or  |Y^S[  >  1  .  If 

-1  <  Y.  <  0,  we  would  estimate  that  the  sample  {X  }  came  from  the 

La  O  n 

1  /2  * 

BELAR( 1 )  process  with  the  negative  sign  on  A^  (a,1-a).  If  | Y |  >  1  ,  we 
would  estimate  Y  by  +1  or  -1. 

In  order  to  obtain  the  "least  squares"  estimate  for  a,  we  solve 
numerically  for  a^s  in 


r(SLS*1/2) 

r(aLS:> 


(III. E. 4. 8) 


for  a  given  Y^g  from  (5.7)  if  |  Y^s  |  <  1. 

A 

The  estimator  Y^s  given  by  (III.  E.  4.7)  has  the  following 
properties  which  we  state  as  a  corollary  to  Theorm  3-1  [Ref.  16]: 

CORROLLARY.  For  {Xn}  given  by  (III  E.4.2);  {Rn)  in  (III. E.  4.  3) 
and  (III. E. 4. 4),  the  least  squares  estimator  Y^s  has  the  following 
properties : 


b)  Since  E(X“)  =  24  <  ®,  ( N- 1 ) 1 ( Y  -Y)  has  a  distribution 
n  Lo 

which  converges  to  the  Normal  with  a  mean  of  zero  and  a  variance  a* 


given  by 


1  +5a-6Y2. 


(III. E. 4. 9) 


o 


2 

Y 


The  proof  follows  from  Theorem( 3. 1 )  •  The  strict  stationarity 
and  ergodic  nature  of  { }  leads  to  the  almost  sure  convergence.  The 
results  of  (III. E. 4.5)  and  (III.  E.  4. 6),  together  with  Billingsley's 
Martingale  Central  Limit  Theorem  provide  the  results  for  the  asymptotic 
Normality  of  Y  g. 

A  strongly  consistent  estimator  for  the  variance,  o2,  is  also 
given  in  [Ref.  16]  for  the  general  R CA ( 1 )  process.  For  o2  in 
(III.E.4.9),  this  estimate  becomes 


(n-1 ) 
n 

V  X2. 


l-c 


i-1 


(1As} 


n 

l  X?  + 
L  i 


(otLS~YLS) 


n 

l  X* 

i-2  x* 


i-1 


n 

l  X2  . 
i-2  1_1 


.  (III. E. 4. 10) 


For  large  n  (III. E. 4. 10)  is  approximated  by 


(1_aLS) 


(n-1)  (a 


LS  LS  '  i-2 


[  X?  . 

,  „  i" 


n 

l  X2  , 
i-2  1‘1 


(III.  E.  4. 11) 


where  Y^  is  from  (III.  E.  4. 7)  and  (III.  E.  4. 8). 

Simulations  of  the  least  squares  estimator  of  Y  were  conducted 
for  selected  values  of  Y  in  SIMTBED  using  Type  III  plans.  The  results 
are  summarized  in  Tables  III. E. 4.1,  III. E. 4. 2  and  III. E. 4. 3.  The 
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results  reflect  the  theoretical  behavior  of  the  estimator  as  derived 
above . 

We  note  that  the  joint  conditional  least  squares  estimators  of  p 

and  Y  in  the  BELAR(I)  process  are  the  same  as  in  the  linear  A R ( 1 ) 

n 

processes.  Minimizing  the  sum  £  R?  where  now 

i-2  1 


R.  =  (X.-p)  -  YtX^-p), 


(III. E. 4. 12) 


leads  to  the  following  joint  estimators  for  p  and  Y 


n  ^  n 

D  -  (  I  X  -  Y  l  X  )  /  (n-1)(1-Y),  (III. E. 4. 13) 

i-2  i-2 


~  n  n 

Y  =.  I  (X  -ti)(X  -&)  /  l  (X.^-p)2.  (III.  E.  4. 14) 

i*2  i-2 


For  large  n  these  equations  reduce  to  the  familiar  ones 


P  =  X 


(III.E.4.15) 


*  n  _  _  n 

Y  =  l  (X  -X)(X  -X)  /  l  (X  -X)2.  (III. E. 4. 16) 

i-2  1  1  i-2  1 


We  now  turn  in  the  next  section  to  the  question  of  alternative 
estimators  for  Y  given  that  p  =  0  and  \  -  1. 
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n 


5.  Other  Estimators  of  the  Lag-1  Serial  Correlation 


a.  Estimators  Based  on  a  Non-linear  Residual 

In  this  section,  we  explore  other  possibilities  for 
estimating  Y  in  the  BELAR( 1 )  process.  There  is  a  question  as  to  why  one 
should  use  the  linear  residual  since  the  BELAR(I)  process  is  a  random 
coefficient  process  which  is  non-linear.  Secondly,  why  should  you 
minimize  the  square  of  the  linear  residual  as  opposed  to  minimizing  some 
other  symmetric  loss  function  which  is  a  function  of  the  linear 
residual?  The  answer  to  both  questions  is  that  the  least  squares 
estimator  of  Y  based  on  the  linear  residual  out- performed  other 
estimators  in  the  simulation  experiment. 

Consider  the  following  types  of  non-linear  residuals 


R*  =  X  -YX  -B(X2-2)  , 
n  n  n-1  n 


R'  =  X  -YX  -6X2  . Sign (X  . ) 

n  n  n-1  n-1  °  n-1 


From  ( III. E. 5.1),  it  follows  that  R^  has  zero  mean  and 


Var (R  )  =  2 ( 1 -Y2  +1  OS 2 )  , 
n 

Cov (R* , R*  , )  =  20a6 2 . 
n  n- 1 


(III.E.5. D 


(III. E. 5. 2) 


(III.E.5. 3) 


( ill. E.5. 4) 


Introducing  the  extra  parameter,  8,  makes  the  residuals,  R  ,  correlated 
unless  a  =  0  or  B  =  0.  If  8  is  zero,  then  we  again  have  the  usual 
linearized  residual  in  ( III.  E.  *4. 4)  .  If  B2  =  Yz/10,  then  the  variance  is 
a  constant,  but  the  residuals  are  still  correlated.  It  is  easy  to 
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compute  the  least  squares  estimators  for  Y  and  6  from  (III.  E. 5.1)  and 
(III. E. 5. 2).  We  simulated  the  estimators  of  Y  and  8  and  compared  them 
to  the  results  based  on  (III. E. 4. 4)  with  8-0.  From  Table  III. E. 5.1, 
we  see  that  the  different  estimators  of  Y  from  all  three  residuals  are 
close  to  the  true  Y.  The  result  is  that  the  estimates  of  8  are  very 
close  to  zero. 

To  see  how  much  the  value  of  Y  could  change  with  8  fixed  at 
some  non-zero  values,  we  simulated  the  least  squares  estimator  of  Y  with 

6  =  0  and  the  estimator  of  Y  based  on  (III. E. 5.1)  with  6  =  Y/  1 0  and 

again  with  8  =  -Y//  10.  From  Table  III. E. 5. 2,  we  see  that  6^0 

severely  alters  the  estimate  of  the  serial  correlation.  Therefore,  in 
the  remainder  of  this  subsection,  we  consider  alternative  estimators  for 
Y  in  the  BELAR(I)  process  to  be  only  those  based  on  the  linear  residual, 
b.  Estimators  Based  on  the  Linear  Residual,  Rn 

Besides  the  asymptotically  unbiased  least  squares  estimator, 
we  considered  the  following  well-known  estimators  of  Y  in  linear  AR(1) 
models: 

1)  The  Huber (c)  function  as  described  by  Denby  and  Martin  [Ref.  38]. 

The  estimator,  Y„,  is  the  value  of  Y  that  satisfies  the 
H 

equation 


I  X  »  <X  -YX  )  -  0, 


(III. E. 5. 5) 


TABLE  III. 3. 5.1 


Simulation 

Results 

for  Various 

Definitions  of  R  in 
n 

BELAR(I) 

1 

.  N  -  500 

a  -  .  5 

Y  - 

C°rr(VXn-1 

)  *  .63662 

DATA 

a 

yls 

8-0 

y 

8 

Y' 

a 

6’ 

XI 

.56891 

0 

.571 92 

.00279 

.62082 

-.01771 

X2 

.61996 

0 

.61630 

-.00815 

.56054 

+  .01637 

X3 

.62651 

0 

.62604 

.00358 

.78189 

-.05808 

X4 

.57995 

0 

.58374 

-.01 865 

.7571  6 

-.07208 

X5 

.59236 

0 

.59233 

-.02100 

.70995 

-.04748 

AVG 

. 59754 

.59807 

-.00829 

.68607 

-.03580 

STD 

.02499 

.02257 

.01 1 54 

.09330 

.03535 

BIAS 

-.03908 

-.03855 

-.00829 

+.04945 

-.03580 

2. 

N  -  1000 

a  =  .  5 

Y  - 

C°rr(VXn-l)  * 

.63662 

DATA 

tls 

ZJO 

II 

O 

Y 

~  * 

8 

Y’ 

a 

8' 

Y1 

.63026 

0 

.62955 

-.00423 

.62985 

.0001 3 

Y2 

.67422 

0 

.65653 

.02520 

.59178 

.03095 

Y3 

.62566 

0 

.62921 

-.00590 

.59646 

.01093 

Y  4 

.67738 

0 

.67777 

.00233 

.60522 

.02359 

Y5 

.64664 

0 

.64784 

-.00560 

.62841 

.00581 

AVG 

.65083 

.64818 

.61034 

.01 428 

STD 

.0241 1 

.02032 

w'-SH  3 

.01782 

.01 273 

BIAS 

+  .01  421 

+.01156 

+.00236 

-.02628 

+  .01  428 

s 


<• 


TABLE  III. E. 5. 2 

Simulation  Results  for  Various  Definitions 

of  R  to  Estimate  Y  Given  $  in  BELAR(I) 
n 


DATA 


N  =  500;  a  -  .5  Y  =  .63662 


(\sl®  *  °> 


1 

.56891 

.27552 

.2741 0 

2 

.61996 

.21515 

.26257 

3 

.62695 

.38621 

.38450 

4 

.57995 

.34356 

.39730 

5 

.59236 

.36082 

.40557 

where 


fH(t) 


t  if  | t |  £  c, 

c  Sign(t)  if  |t  |  >  c . 


(III. E. 5. 6) 


R: 


,v 


I' 


The  corresponding  weight  function  w  (t)  is  4'„(t)/t  and  c  is 

H  n 

/N 

a  tuning  constant.  As  c  goes  to  infinity  ¥„(t)  approaches  t  and  Yu  is 
the  least  squares  estimator  of  Y.  If  c  =  0,  we  have  the  solution  of 
(III. E. 5. 5)  is  the  median  of  X./X._1 . 

For  c  other  than  0  or  ®f  there  is  no  closed-form  solution  to 
(III. E. 5. 5).  We  obtain  the  Huber  (c)  estimator  of  Y  by  iterating  the 
following  scheme: 


n 

l  x.x 

•  o  1 
1=2 


i-1 


X. -Y, x  .  .  i 

r  ’ 


2 

i-1 


w, 


-  Y.  X  .  .  i 

k  l-l  J 

S  J 

r  ' 


(III. E. 5. 7) 
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where  Y^  is  the  least  squares  estimator  of  Y  and 


S 

r 


median  |Xi| 


.69315 


t 


(III. E. 5. 8) 


is  the  scaling  constant  for  the  FT.  If  Y  ■  0,  then  is  the  median 
absolute  deviation  estimator  of  the  scale  parameter  in  the  Laplace 
distribution  as  given  in  Section  III.E.3.  Typical  values  of  c  are  1, 
1.5,  2.  We  use  for  illustration  c  =  1  in  the  simulation  along  with  YLg, 

A 

the  least  squares  estimate,  and  Y„,  the  median  (X./X.  , ) . 

M  1  1-1 

2)  The  Least  Absolute  Deviation  (LAD)  estimator  of  Y  is  the  minimizer 
of 

n 

l  jx.-Yx.  | .  (III. E. 5. 9) 

i-2  1 

A 

The  solution  is,  YWM>  the  weighted  median  of  xVx^  where 
the  weights  are  x^_ 1  for  i  =  2,...,n. 

Denby  and  Martin  [Ref.  38]  reported  that  the  Huber(c) 
estimates  are  consistent  and  asymptotically  unbiased  for  linear  AR(1) 
models.  Bloomfield  and  Steiger  [Ref.  39]  showed  that  the  LAD  estimator 
is  strongly  consistent  and  asymptotically  unbiased  for  linear  AR( 1 ) 
models.  In  Figures  III. E. 5.1  -  III. E. 5. ^  are  examples  from  SIMTBED  of 
the  behavior  of  these  estimators  in  simulated  data  from  LAR( 1 ) ,  a  linear 
A R (  1  )  model  with  Laplacian  marginals  and  AR(1)  correlation  structure 
given  in  Chapter  II.  These  results  appear  to  be  consistent  with  the 
results  reported  above  for  linear  AR( 1 )  processes.  The  leading 
coefficient  in  the  expansion  for  the  mean  of  each  estimator  does  not 
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Figure  III. E. 5. 3.  SIMTBED  Boxplot  Analysis  of  Huber (c)  Estimator  of  y  with  y=. 63662  and 

c=l  in  the  LAR(l)  Process 


differ  significantly  from  the  true  value,  0.63662.  We  also  see  that  the 
median  (Xi/Xi_1)  and  the  weighted  median  (Xi/Xi_1)  estimators  are 
considerably  more  efficient  than  either  the  Huber(c)  estimator  in  Figure 
III.  E.  5. 3  or  the  least  squares  estimator  (c  =»  ®)  in  Figure  III.  E. 5. 4. 

Since  the  least  squares  estimator  remains  asymptotically 
unbiased  for  the  BELAR(I)  process  as  was  shown  in  Section  III.E.4,  it 
was  of  interest  to  observe  how  the  Huber (c)  estimators,  c  <  ®,  and  the 
LAD  estimator  of  Y  would  behave.  Considering  the  ordering  suggested  by 
the  simulation  results  in  the  LAR(1)  process,  it  would  seem  possible 
that  the  Huber(c)  estimates  could  be  better  than  the  least  squares 
estimator  of  Y.  In  the  boxplot  analyses  in  Figures  III. E. 5. 5  - 
III. E. 5.8  are  the  results  of  the  simulation  for  Y  =  .  63662,  but  for 
data  from  the  BELAR( 1 )  process.  The  boxplots  in  Figure  III. E. 5. 5 
display  the  theoretical  behavior  of  the  least  squares  estimator  of  Y. 
The  other  estimators  of  Y  appear  to  be  converging  to  other  values 
Yq  *  Y.  To  see  this,  note  the  first  entry  in  the  coefficients  for  the 
asymptotic  expansion  of  the  mean  of  Y  in  Figures  III. E. 5.6  -  III. E.  5.8. 
In  each  case  YQ  >  Y.  Also  from  the  estimate  of  the  standard  deviation, 
we  assert  that  YQ  is  significantly  larger  than  Y  for  each  of  the 
alternative  estimators  investigated  here,  because  the  difference, 

| Y  -  Yq| ,  is  larger  than  four  standard  deviations. 

For  the  BELAR(  1 )  process,  we  observe  a  reversal  from  the 
LAR(1)  process  in  preference  for  the  estimator  of  Y.  We  will  use  the 
least  squares  estimator  as  the  initial  estimator  of  Y  in  the  iterative 
procedure  for  finding  the  maximum  likelihood  estimator  of  Y  which  we 
develop  next . 
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III. E. 5. 6.  SIMTBED  Boxplot  Analysis  of  the  Huber (c)  Estimator  of  y  with 
y=. 63662  and  c=l  in  the  BELAR(l)  Process 


6.  Maximum  Likelihood  Estimation  of  Y 
a.  Introduction 

In  this  section,  we  develop  the  maximum  likelihood  estimator 

A 

of  the  lag-1  serial  correlation  in  the  BELAR( , )  process,  Y  .  We  use 

MLCi 

the  expression  for  the  logarithm  of  the  likelihood  function,  L(a),  in 

(III. D. 2. 12)  in  an  iterative  procedure  to  find  the  values  of  o  and  the 
1  /2 

sign  of  (a,1-a),  that  minimizes  -L(ot);  call  the  pair  (a^g,  sign). 

1  /2 

Since  knowing  a  and  the  sign  of  An  (a,1-o)  uniquely  defines  Y,  can 

be  found  from  (III. E. 4. 8)  using  (a._  sign). 

MLh 

We  consider  only  the  univariate  problem.  That  is,  we  have 
assumed  that  { }  is  marginally  Laplace  distributed  or  have  determined 
from  Q-Q  plots  that  the  best  i-Laplace  fit  to  the  data  is  when  l  =  1  . 
Secondly,  we  assumed  that  {X^}  is  standard  Laplace  (p  =  0;  X  =  1)  or 

A 

that  (Xn)  has  been  standardized  using  a  pair  of  estimators  (p,  X)  from 
Sections  III.E.2.  and  III.E.3. 

As  a  function  of  a,  (III.D.2.12)  is  very  complicated.  There 
is  little  hope  of  being  able  to  analytically  solve  for  the  critical 
values  of  a.  In  fact,  the  evaluation  of  a  derivative  of  (III.D.2.12)  is 
at  least  as  expensive  computationally  as  the  function  values  themselves, 
since  (III.D.2.12)  contains  exponential  functions  of  a.  However,  since 
this  is  a  one-dimer.sional  optimization  problem,  there  are  IMSL  routines 
that  will  perform  the  search  without  using  deviati ves--Golden  Section 
search;  bisection  method;  or  interpolation  routines. 

We  chose  the  IMSL  routine  ZXLSF  which  performs  a  one¬ 
dimensional  search  for  a  minimum  of  a  smooth  function  in  a  closed 
interval  using  quadratic  interpolation.  The  FORTRAN  routine  which 
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evaluates  ( III.  D.  2. 12)  is  formulated  so  that  ZXLSF  is  searching  on  the 
interval  (-1,1)  where  a  <  0  implies  that  conditional  densities  of  the 
form  (III. D. 2. 10)  are  being  evaluated  instead  of  those  given  by 
(III. D. 2. 9)  when  a  >  0.  The  initial  value  for  a  to  start  the  iteration 
procedure  of  ZXLSF  is  a  four-digit  approximation  (&LS»  sign^g) 
corresponding  to  the  least  squares  estimate  of  serial  correlation,  Y^g, 
obtained  from  ( III. E. 4.8) . 

The  queston  of  accuracy  in  the  calculation  of  (III.D.2.12) 
is  especially  important  because  the  likelihood  surface  is  extremely  flat 
in  many  cases.  We  want  some  assurance  that  ZXLSF  is  efficiently 
searching  for  the  optimum  and  not  "chasing  roundoff  errors".  This 
happened  before  we  increased  the  accuracy  parameter  in  DCADRE  and  used 
double  precision.  In  order  to  assess  the  accuracy  of  our  calculations, 
we  constructed  first-  and  second-divided  differences  for  values  of  a  and 
(III.D.2.12).  The  divided  differences  are  approximations  for  the 
derivatives.  For  those  simulations  that  we  checked,  there  was  one 
transition  of  the  slope  through  zero  at  the  critical  point  found  by 
ZXSLF.  The  second-divided  differences  at  all  points  in  the  vicinity  of 
the  critical  value  were  positive  indicating  the  general  convex  upward 
shape  of  (III.D.2.12).  Sometimes  there  was  some  fluctuation  in  values 
of  the  second-divided  differences,  but  no  change  of  signs  near  the 
reported  optimum. 

The  fluctuating  values  of  the  second-divided  difference 
indicated  some  noise  in  the  calculations.  This  occurred  in  two  places. 
If  the  s  e  co  nd- di v i ded  difference  covered  points  on  both  sides  of 
a  =  1/2,  then  there  was  often  a  jump  in  the  value  of  the  second-divided 
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difference.  This  occurred  because  of  the  change  in  the  method  of 
calculating  the  conditional  density  when  a  changed  from  a  <  .5  to 
a  2  .5.  Other  times,  slight  aberrations  in  the  observed  pattern  of  the 
second-divided  differences  occurred  for  values  of  a  that  were  small, 
0  <  a  <  .15.  This  is  attributed  to  the  fact  that  DCADRE  evaluations  for 
the  table  of  values  of  the  ( 1 -a) -Laplace  density  (0  <  a  <  .15)  in  many 
subintervals  was  not  behaving  regularly.  The  computed  value  was 
accepted  because  the  estimated  error  was  small,  relative  to  the  accuracy 
requirements.  The  important  consideration,  however,  was  that  no  error 
in  calculating  (III. D. 2. 12)  should  be  so  large  as  to  falsely  indicate  a 
change  in  convexity  in  the  vicinity  of  an  extremum,  so  that  ZXLSF  would 
be  ineffective  at  locating  it. 

The  selection  of  a  good  starting  point  in  this  procedure  is 
also  important.  It  is  desirable  to  commence  the  iteration  in  ZXLSF  as 
close  to  the  global  optimum  as  possible  in  order  to  reduce  the 
possibility  of  converging  to  a  local  optimium.  Note,  also,  that  as  a 
function  of  at,  the  conditional  density  is  not  necessarily  convex  and 
often  is  not  even  unimodal  across  the  range  from  Y  =  +1  to  Y  =  -1 . 

Since  (III. D. 2. 12)  is  the  logarithm  of  the  product  of  such 
functions,  there  is  no  assurance  that  (III.D.2.12)  has  a  single  relative 
maximum  especially  for  small  sample  sizes.  When  the  sample  size  is 
small,  it  is  advisable  to  pick  a  starting  value  for  the  iteration  on 
both  sides  of  a  =*  0.  Select  the  maximum  likelihood  estimator  to  be  the 
one  with  the  higher  value  of  L(a)  if  the  routine  produces  two  different 
a's,  corresponding  to  the  pairs  (a^-O  and  (fi  ,-). 


Since  we  know  that  Y^g  is  a  consistent,  asymptotically 
unbiased  and  asymptotically  Normally  distributed  estimator  for  Y,  we 
chose  the  value  of  a  and  model  corresponding  to  Y^g  as  our  initial  guess 
in  ZXLSF. 

b.  Simulation  Results 


The  maximum  likelihood  routine  for  estimating  Y  was  tested 

in  simulations  using  computer  generated  data  from  the  BELAR(  1 )  process 

with  known  parameter  values  of  2,,  u,  X  and  a.  By  performing  M 

independent  simulations  of  sample  size  N  (where  N  is  increased  for  each 

set  of  M  simulations)  and  fixed  a,  we  were  able  to  compare  the  standard 

deviation  and  bias  (if  any),  of  Yu.  to  that  of  the  initial  least 

ML  h 

/V 

squares  estimator  Y^g,  for  which  the  asymptotic  distribution  is  Normal. 
Changes  in  the  Normal  plots  for  one  set  of  M  simulations  for  N  small  to 
a  second  set  of  M  simulations  for  a  larger  N  would  give  some  indication 


of  how  fast  Y„.  _ 
MLE 


is  or  is  not  converging  to  a  Normal  distribution. 


Both  M  and  N  were  small  in  the  simulations  for  two  reasons. 


Since  the  asymptotic  distribution  of  Y^g  was  known,  it  was  of  more 

interest  to  see  how  much  better  YWI  _  was  for  the  smaller  samples  (i.e., 

MLE 

* 

was  the  bias  smaller  for  Y...  „  or  was  it,  in  fact,  unbiased).  Secondly, 

MLL 

the  run  time3  for  calculating  (III.D.2.12)  for  N  <  200  was  long.  The 
evaluation  per  sample  of  size  N  =  25  ranged  from  100-300  secs.  For 
N  =  175,  the  run  times  ranged  from  700-950  secs. 

Figures  III. E. 6.1,  III. E. 6. 2  and  III. E. 6. 3  are  the  Normal 
plots  of  twenty  realizations  of  the  maximum  likelihood  estimator  of 
serial  correlation  and  the  least  squares  estimator  of  serial  correlation 
for  simulated  data  from  the  BELAR(I)  process  for  selected  values  of  a 
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Figure  III. E. 6.1.  Normal  Probability  Plots  of  the  Maximum  Likelihood  and  the 

Least  Squares  Estimators  of  y  in  the  BELAR(l)  Process  for 
20  Samples  of  Sizes  25  and  125  with  a=.ll  and  y=. 19216 


Figure  III. F,. 6. 2.  Normal  Probability  Plots  of  the  Maximum  Likelihood  and  the 

Least  Squares  Estimators  of  y  in  the  BELAR(l)  Process  for 
20  Samples  if  Sizes  25  and  175  with  a=.5  and  63662 


and  for  two  subsample  sizes,  SSN.  The  layout  provides  for  two-way 

A  A 

comparisons.  That  is,  Y...  „  from  smaller  SSN  can  be  compared  to  Y._  _  for 

MLh  MLh 

larger  SSN.  Likewise,  for  a  given  SSN,  YMT  r  can  be  compared  to  Y.c> 

ML  h  Lo 

which  is  known  to  have  an  asymptotic  Normal  distribution.  The  straight 
line  in  the  Normal  plots  corresponds  to  a  Normal  distribution.  The 
curved  lines  correspond  to  the  Kolmogor ov-Smirnoff  bounds  calculated 
from  the  data  at  the  95$  confidence  level  by  the  routine  in  the  IBM 
experimental  APL  routine  called  GRAFSTAT. 

It  appears  from  these  figures  that  for  the  larger  values  of 

iS  A 

SSN,  Ym^e  and  Y^s  fit  Normal  distributions  better  than  the  corresponding 
samples  from  the  smaller  values  of  SSN.  It  also  appears  that  Y^EE  fits 

a 

a  Normal  distribution  as  well  as  the  Y  for  the  larger  values  of  SSN. 

Lo 

This  supports  the  notion  that  the  maximum  likelihood  estimator  is 
converging  to  a  Normal  distribution. 

Figures  III. E. 6. 4,  III. E. 6. 5  and  III. E. 6. 6  are  the 
corresponding  scatter  plot  analyses  for  the  data  in  the  previous  figures 

A  A 

for  the  larger  value  of  SSN.  It  appears  that  Y  and  Y  have  a 

ML  L  Lo 

positive  correlation  coefficient  which  becomes  more  pronounced  as  the 
data  becomes  less  correlated.  The  distribution  of  Yw.  _  also  appears  to 
have  a  smaller  variance  than  Y^.  This  effect  is  more  pronounced  for 
more  highly  correlated  data.  Compare,  for  example,  the  estimated 

/V  A 

standard  deviation  of  YMI  „  and  that  of  Y.  from  the  table  in  Figure 

MLL  Lo 

III. E. 6. 4  with  the  corresponding  entries  in  the  table  from  Figure 
III. E. 6. 6. 
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Figure  III. E. 6. 4.  Scatter  Plot  Analysis  of  the  Maximum  Likelihood  and  the  Least  Squares  Estimators 

of  y  in  the  BELAR(l)  Process  for  20  Samples  of  Size  125  with  a=.ll  and  y=. 19216 


Scatter  Plot  Analysis  of  the  Maximum  Likelihood  and  the  Least  Squares  Estimators 
of  y  in  the  BELAR(l)  Process  for  20  Samples  of  Size  250  with  a=.844  and  y=~. 89986 


F.  i-LAPLACE  MOVING  AVERAGE  MODELS 


1 .  Introduction 

In  this  section,  we  derive  a  time  series  model  that  has  an 

H-Laplace  marginal  distribution  and  the  correlation  structure  of  a 
th 

linear  q  -order  moving  average  model.  This  construction  uses  the 
square  root  Beta-Laplace  transform  given  in  Section  III.B.3.  The  first- 
order  model  retains  the  full  range  of  correlations  up  to  1/2. 

2.  The  First-Order  Moving  Average  Model 

Let  {L  (&-a)}  be  an  i.i.d.  sequence  of  ( i-a) -Laplace  random 
n 

1  /2 

variables;  (A^  (a,  i-2o)  }  be  an  i.i.d.  sequence,  independent  of 

{L  (H-a)},  where  A  is  a  Beta  (a.,l-2a)  random  variable  and  0  <  a  <  1/2. 
n  n 

Both  the  innovation  and  the  coefficient  sequences  are  independent  of 
Xn-1  ’  Xn-2 .  Then  the  sequence  {X  (&}  given  by 

Xn(l)  =  LnU-a)  +  A^/2(a,H-2ct)Ln-1U-a)  ,  (III.  F.  2.1) 

has  a  marginal  2,-Laplace  distribution  and  an  MA(1)  correlation  structure 

such  that  0  <  Corr(X  ,X  . )  <  1 /2 . 

n  n-1 

To  see  that  X^Ol)  has  an  fc-Laplace  distribution,  first  note  that 

by  the  square  root  Beta-Laplace  transform  theorem  of  Section  III.B.3, 

1  /2 

the  distribution  of  the  product  A  (a,i-2a)L  , ( i-a)  is  a-Laplace. 

n  n-1 

Then  note  that  X  (fc)  is  the  sum  of  two  independent  random  variables,  one 
of  which  has  an  ( i-a) -Laplace  distribution  and  the  other  has  an  a- 
Laplace  distribution.  So,  if  <j>  (to)  is  the  characteristic  function  of 

A 

X  ( l) ,  then 
n 


(  1  +0)2  J  [l+U)2j  [  1  +U)i  j 


(III. F. 2. 2) 


To  see  that  (Xn(£)}  has  the  correct  correlation  structure,  first 
note  that  by  the  construction  of  (III. F. 2.1),  X^_k  is  explicitly 
independent  of  for  |k|  £  2.  Therefore,  Corr  (X^  ,Xn_  ^ )  is  zero  for 
|k  |  *  2. 

For  k  =  ±1  ,  we  have,  after  some  simplification 


Corr (X  ,X  ,  )  = 
n  n-k 


aT(  r(  2,+1  -a) 

J.r(a+1)rU-a+^) 


(III. F. 2. 3) 


Finally,  note  that  in  the  limit  as  a  +  0,  (III.F.2.3)  approaches 

zero.  Also,  as  a  -*  i/2,  (III.F.2.3)  approaches  1/2. 

Replace  A1^(a,2,-2a)  in  (4.1)  by  -A^/2(a, H-2a) ,  we  have  a  full 

range  (-1/2,0)  of  nonpositive  lag-1  serial  correlations. 

3.  The  q-Order  Moving  Average  Model 

The  MA  ( q )  model  for  q  2  2  is  the  extension  of  the  MA(1)  model 

given  in  (III.F.2.1).  Let  {Ln(l-qa)}  be  an  i.i.d.  sequence  of  (8,-qa)- 

1  /2 

Laplace  random  variables.  Let  [A  .  { a,  2,-(q+1 )  a)  ]  for  i  =  1,...,q  be 

n ,  l 

i.i.d.  sequences,  independent  of  each  other  and  of  { Ln  (  d-qa)  } ,  where 

A  is  a  Beta  {  a  ,  2.  -  (  q  + 1  )  a }  random  variable  for  all  n  and  all 
n  ,  l 

i  =  1,...,q.  Also,  0  <  a  <  2,/(q  +  1  )  .  Both  the  innovation  and  each  of 

the  coefficient  sequences  are  independent  of  X  ,  ,X  ^,....  Then  the 

n-1  n-2 

sequence  {X  (8,)}  given  by 


(III. F. 3.1) 


XnU) 


r  1  /2 

,  ( £~qa)  +  l  i(a,)l-(q+1  )a}Ln_i(£-qa) 

a  i=l  ’ 


has  a  marginal  i-Laplace  distribution  and  an  MA(q)  correlation  structure 

f  or  0  <  a  <  £/(q+1).  When  a  =  0,  then  { ( 2.) }  is  an  i.i.d.  sequence; 

when  a  =  £/(q+1),  then  the  moving  average  is  an  equal  weighted  average 

of  q+1  i.i.d.  a-Laplace  error  terms  L  (a). 

n 

To  see  that  X  (£)  is  an  £-Laplace  random  variable,  first  note 
n 

from  the  square  root  Beta-Laplace  transformation  theorem  o'1  Section 

1  /2 

III.B.3,  that  each  product  A  .{a,  £-(q+1)a}L  _.(£-qa)  has  an  a-Laplace 

n  1 1  n- 1 

distribution . 

But  the  sum  of  q  i.i.d.  a-Laplace  random  variables  has  a  qa- 
Laplace  distribution.  Thus,  X  (£)  is  the  sum  of  two  independent  random 
variables  and  its  characteristic  function  is  obtained  as  the  product 


I 


£-qa 


q 

n 


i«1 


a 


_J _ )£-qa  f_1 _ )qa  _  (_1 _ 

1  +u>2  J  [  1  +u>2  J  ( 1  +w2 ) 


(III. F. 3. 2) 


The  correlations  are  truncated  at  lags  |k|  Z  q+1.  By  the 
construction  of  (III.  F.  3.1),  Xn  is  explicitly  independent  of  for 

M  2  q+1 . 

Negative  correlations  are  obtainable  with  2°*  choices  for 

replacing  or  not  replacing  [ A1 /2 ( a, H— (q  +  1 ) a) ]  by  [-A 1 /2 ( a, £=( q  +  1  ) a) ]  in 

n  j  i  n  ■  1 


( III.  F.  3.D. 


This  model  can  be  generalized  from  the  one-parameter  case  by 


replacing  qa  in  (III. F. 3.1)  with  a.  in  each  term  in  the  sum, 

q 

replacing  Ln(l-qa)  by  L»n(  H— q  Z  a^). 


and 
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IV.  RESIDUAL  ANALYSIS  COMPARISON  OF  THE  NLAR( 1 ) 

AND  THE  BELAR(I)  PROCESSES 

A.  INTRODUCTION 

Lawrance  and  Lewis  [Ref.  22]  developed  a  higher-order  residual 

analysis  for  non-linear  time  series  with  autoregressive  correlation 

structures.  Specifically,  they  developed  a  third-order  analysis  based 

on  the  cross-correlation  of  the  linear  residual,  R  ,  and  its  square  at 

lag  k,  R2_k>  They  applied  the  analysis  to  the  problem  of  modelling  wind 

speed  data.  It  is  important  to  note  that  this  analysis  was  done  in 

conjunction  with,  and  not  in  place  of,  the  usual  second-order  analysis. 

As  has  been  already  pointed  out,  second-order  analysis  is  sufficient  for 

modelling  only  when  the  processes  are  both  linear  and  Normal. 

The  residual  analysis  involves  only  joint  moments  of  order  three. 

In  Chapter  II  of  this  thesis,  it  was  shown  that  for  the  NLAR(p)  models 

with  p  =  1,2,  all  the  third-order  moments--that  is,  those  of  the  form 

ECX.X.X  )  for  all  i,  j,  k — are  zero.  Therefore,  the  Lawrance  and  Lewis 
1  J  K 

residual  analysis  will  not  differentiate  between  the  NLAR(p)  processes 

with  the  same  autocorrelation  structure.  It  can  also  be  shown  by 

induction  on  k  that  Corr  (R  ,  R2  .  )  =  Corr  (XZ,R  .)  =  0  in  the  BELA  R  (  1  ) 

n  n-k  n  n-k 

process.  Hence,  either  third-order  residual  analysis  will  be  unable  to 
discriminate  the  BELAR(I)  process  from  any  of  the  NLAR(1)  processes  with 
the  same  autocorrelation  function. 

In  the  spirit  of  looking  at  the  lowest  possible  joint  moments  for 
differentiating  between  models  with  symmetric  marginals,  a  fourth-order 
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analysis  is  proposed.  Two  candidates  are  investigated  as  the  basis  of 


this  analysis.  The  first  one  considered  is  the  cross-correlation  of  X3 
J  n 

and  the  linear  residual  at  lag  k,  R  ,  .  The  second  is  the 

6  n-  k 

autocorrelation  of  R2  and  R2_k-  The  two  analyses  are  compared  by  their 

abilities  to  differentiate  among  the  different  types  of  NLAR(1) 

processes  and  the  BELAR( 1 )  process. 

Table  IV. A.  1.  contains  a  summary  of  the  models  in  the  comparison, 

along  with  the  selected  sets  of  parameter  values  and  corresponding 

correlation  coefficient.  Even  though  each  of  the  models  has  the  same 

marginal  distribution  (standard  Laplace)  and  identical  autocorrelation 

functions,  each  has  a  theoretical  cross-correlation  function  in  terms  of 

(X 3 , R  .  )  and  autocorrelation  function  for  (R2,R2  ,  )  that  are  different, 
n  n-k  n  n-k 

The  question  of  how  the  residual  analysis  is  affected  by  parameter 
estimation  is  an  important  issue,  but  is  not  addressed  at  this  time. 

Before  the  candidates  are  developed  in  the  next  two  sections,  it  is 
convenient  now  to  place  both  the  NLAR(  1  )  and  BELAR(  1 )  processes  into 
their  common  RCA(1)  framework. 

Using  the  terminology  of  Nicholls  and  Quinn  [Ref.  16],  both  the 
NLAR(1)  and  the  BELAR(I)  processes  can  be  written  as 


X  =  cX  +  B  X  ,  ♦  e  , 
n  n-1  n  n-1  n 


(IV. A. 1) 


where  { e:^ }  is  the  i.i.d.  innovation,  E(e  )  =  0,  and  otherwise  defined  as 
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1.  (1 -a) -Laplace  in  the  BELAR( 1 )  process; 

2.  standard  Laplace,  but  with  a  degenerate  component  at  the  origin  in 
the  LAR( 1 )  process; 

3.  scaled  Laplace  where  A  »  /  1  -a  in  the  TLAR( 1 )  process; 

4.  convex  mixture  of  scaled  Laplace  variables  in  the  general  non¬ 
boundary  NLAR(1)  process. 

TABLE  IV. A. 1 

I  k  I 

Summary  of  Models  with  Laplace  Marginals  and  Autocorrelations  of  Y1  1 


Model 

Parameter  Values 

Y 

Comments 

a1  = 

1  ; 

B1  -  .19216 

.19216 

Linear  models; 

LAR(1) 

a1  = 

1  ; 

B1  =  -.63662 

- . 63662 

one  boundary  of 

a1  = 

1  ; 

81  =  .89986 

.89986 

NLAR(1)  family. 

a1  = 

=■  .43836 

.19216 

General  discrete 

NLAR(I) 

a1  = 

.797885;  B.  -  -a. 

-.63662 

random  coefficient 

a1  3 

81 

-  .94861 

.89986 

model . 

a  =  . 

11; 

positive  model 

.1921 6 

General  continuous 

BELAR(I) 

a  =  . 

50; 

negative  model 

- .63662 

random  coefficient 

a  =  . 

884 

;  positive  model 

.89986 

model . 

a1  = 

.19216;  6 1  =  1 

.1921 6 

Other  boundary 

TLAR(  1  ) 

a1  = 

.63662;  61  -  -1 

- . 63662 

model  of  NLAR( 1 ) . 

a1  = 

.89986;  81  =  1 

.89986 

The  {B^}  process  is  the  i.i.d.  random  coefficient  process, 

independent  of  (e  }  and  {X,  }  for  k  £  n-1  with  E(B  )  =  0  and  otherwise 

n  k  n 
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■■  * afi «  •»  >0 


defined  as: 


1.  ±(  A^2 ( a,  1  -a)  -  Y)  ,  where  Y  -  E{ A^^Ca,  1-a) }  and  A  (a,1-a)  is  a 
standard  Beta  random  variable  in  the  BELAR(I)  process; 

2.  0  in  the  LAR( 1 )  process,  since  it  is  a  linear,  constant 
coefficient  AR(1)  process; 

3.  8.j  {K^-a}  in  the  other  NLAR(  1 )  processes,  where  is  a  Bernoulli 

random  variable  such  that  P(K'  -  1)  =  a,  and  0  £  1 8,  I  £  1  and  a , 

n  1  1  1  1  1 

and  61  are  not  both  unity.  At  81  =*  ±  1  the  process  is  the  TLAR(1) 
process . 

The  c  is  a  constant  defined  as: 

1.  Y  -  E{ A^/2(a, 1-a) }  in  the  BELAR(I)  process; 

2.  a.8,  =  8,E(K')  in  all  the  NLAR(1)  processes  (a.  =*  1  being  the 

i  1  in 

LAR( 1 )  process). 

The  second  and  fourth  moments  of  E^  and  the  second,  third  and  fourth 
moments  of  B^  are  needed  in  the  following  sections.  In  Table  IV.  A.  2, 
there  is  a  convenient  summary  of  the  necessary  equations. 

Now  the  linear  residual,  written  in  terms  from  (IV. A. 1)  has  the 
following  forms  analogous  to  (III. E. 4. 3)  and  (III. E. 4. 4), 


R  =  B  X  +  e  , 
n  n  n-1  n 


R  =  X  -  cX  ,  . 
n  n  n-1 


(IV. A. 2) 


(IV. A. 3) 


Using  (IV. A. 2)  and  the  independence  of  (B  }  and  (e  },  the  second  and 

n  n 

fourth  moments  of  R  are 

n 
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(IV. A. 4) 


E(R2)  -  2E( B2)  +  E(e2) , 
n  n  n 

E(R-)  =  24E(  B1* )  +  1  2E(  B2 ) E( e2 )  +  E(e“).  (IV. A. 5) 

n  n  n  n  n 


The  variance  of  R2  when  needed  is  derived  from  (IV. A. 4)  and  IV. A. 5). 
n 

B.  RESIDUAL  ANALYSIS  USING  Corr(X3  ,R  .  ) 

n  n-k 

In  this  section,  the  residual  analysis  using  the  theoretical  cross¬ 
correlations,  Corr(X3,  Rn_k)  is  developed.  Using  (IV. A. 1)  and  (IV. A. 2), 
we  have 


X3 

n 


c3X 


n-1 


3o  2X  2  R 
n- 1  n 


3cX  , R2 
n-1  n 


Rn> 


(IV.B.1) 


where  c  is  defined  in  Section  IV. A. 

The  cross-correlation  function  of  X3  and  R  ,  at  lag  k  is  defined  as 

n  n-k 

E(X  3R  ) 

C  (k )  =  Corr (X 3 , R  )  =  - n  n~  ,  (IV. B. 2) 

31  n  n-k  °x3°r 

An  nn-k 

where  Var(X3)  =  E(X6)  =  6!  and  Var(R  .  )  is  given  by  (IV. A. 4)  for  all  n 
n  n  n-k 

and  all  k,  since  as  shown  in  Section  III.E.3,  is  stationary 

whenever  {X  }  is. 
n 

Now  from  the  construction  of  Rn  in  (IV. A. 2),  it  is  explicity  clear 

that  X  and  R  ,  are  dependent  for  all  k  and  that  the  {R  }  are  not 
n  n-k  K  n 


independent  of  each  other,  unless  B  is  identically  zero  as  in  linear 


constant  coefficient  AR( 1 )  processes.  However,  by  the  Residual  Theorem 

(Lawrance  and  Lewis  [Ref.  22]),  the  {R  }  are  uncorrelated. 

n 

From  (IV.B.1),  we  have  for  all  k 


C 


31 


(k) 


l0'E<Xn-,Bn-k>  *  30*E(x;.,B„Rn.k)  *  3=E(X 

*  E(R’R  ))  /  [12/T  {£(R!))1/2]. 
n  n-k  n 


n-1 


R2R  ,  ) 
n  n-k 


(IV. B.  3) 


Consider  (IV. B. 3)  for  k  <  0.  Since  the  random  coefficients  {B^} 
are  independent  of  the  process  { Xj }  for  j  £  n-1,  this  implies  that 
C„(k)  is  zero  for  k  <  0.  For  k  =  0  in  (IV. B. 3),  we  have,  after  some 
simplification, 


V0) 


72c!E(B"  )  *  6c’E (£J  )  *  72cE(B’)  *  E(R') 
_ n _ n  n _ n 

12/”  {  E(  R2  ) } 1  72 
n 


(IV.B.4) 


For  k  5  1,  there  is  the  following  recursive  formula, 


ck ( 1 -c2) E( e  2) 

C  (k)  =  C  (K-1){c3  ♦  3cE(B"  )  ♦  E( B3 )  }  +  - -  .  (IV. B. 5) 

31  31  n  2/— {£(r2)}1/2 

n 


It  is  now  a  simple  matter  to  consolidate  the  expressions  for  C  (k) 
for  all  k  and  substitute  the  appropriate  expressions  from  Table  IV. A.  2. 
For  the  NL AR( 1 )  models--including  LAR( 1 )  ,  for  which  a,  =  1 ,  and  TLAR(1) 
for  which  81  =  ±1 --we  have 
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0, 


k  <  0; 


C31 (k) 


{2  -  ofBf  +  6a1  8^  1~2a1 )  (1-a1 )  -  a281‘,(8-19a1 +1  2a2) } 


/  10  (1-ajBj) 


1  /2 


,  k  -  0; 


a1813C31  (k-1)  + 


ojejd-o^pd-a^;)172 

/To 


( IV. B. 6) 


k  >  1 


For  the  BELAR(I)  process,  we  have 


0, 


k  <  0; 


c31(k) 


(6  -  5Y2  -  oY2) 

3 /To  o-y2)1'2’ 


k  -  0;  (IV. B. 7) 


|(1+2o)C31 (k-1) 


+  Tk(1-o)(1-T2)1/2 

/To 


k  >  1. 


The  theoretical  cross  correlation  functions  for  each  of  the  models 
and  sets  of  parameters  in  Table  IV. A. 1  are  given  in  Figures  IV.B.1  - 
IV. B. 3.  Three  points  can  be  made.  For  the  models  with  |p|  small,  such 
as  in  Figure  IV.B.1,  there  is  little  difference  between  the  cross¬ 
correlation  functions  of  all  four  models.  (Of  course  for  p  =  0,  there 
is  absolutely  no  difference,  since  all  NLAR(1)  models  and  the  BELAR(I) 


model  collapse  into  the  unique  i.i.d.  case).  A  difference  between  the 
cross-correlation  function  of  the  boundary  NLAR(1)  models — L A R ( 1 )  and 
TLAR(1) — does  become  more  apparent  as  |p|  increases.  But,  there  seems 


THEORETICAL  CROSS-CORRELATION  OF  Xn^  AND  R 

LAR(  1 )  or,  =  1  8,=p  p=  19216  NLAR(  1 ):  a,  =  B,=p5  p=  1 


Theoretical  Cross-Correlation  Functions  of  and 
RCA ( 1 )  Processes  with  p(l)=. 19216 


THEORETICAL  CROSS-CORRELATION  OR  Xn3  AND  R 

LAR(I):  a.  =  1  8.=p  p  =  -  63662  NLAR(1):  er ,  =p  ^  p* 


Theoretical  Cross-Correlation  Functions  of  X  and  R  for 
RCA ( 1 )  Processes  with  p(l)=-. 63662  n  n_k 


THEORETICAL  CROSS-CORRELATION  OE  Xn3  AND  Rn_k 

LAR(t)  c»,  =  1  B,=p  p=  89986  NlAR(t):  a,  =  B,=p5  p=  89986 


Theoretical  Cross-Correlation  Functions  of  X  and  R  for 
RCA ( 1 )  Processes  with  p(l)=. 89986  n  n-k 


to  be  little  distinction  between  the  cross-correlation  functions  of  X3 

n 

and  Rn_k  from  the  BELAR(I)  process  and  the  non-boundary  MLAR( 1  )  process 

with  0^=  B  i  =  / 1  p  |  even  when  |p|  is  large  as  in  Figure  IV. B. 3-  This 
final  point  suggests  the  possibility  that  there  exists  a  pair  of  values, 
(o^B^),  whose  product  is  p  4  0,  for  which  the  BELAR(I)  process  and  the 
corresponding  NLAR(1)  process  will  not  only  have  identical 
autocorrelation  functions,  but  may  also  have  nearly  identical  cross¬ 
correlations  of  X3  and  R  ,  for  some  specified  number  lags 

n  n-k 

k  =  0, 1 ,  . . .  ,j . 

The  cr oss - corr el  at i ons  of  X3  and  R  ,  can  also  be  used  to 

n  n-k 

distinguish  the  random  coefficient  A R ( 1 )  processes  with  a  standard 
Laplace  marginal  distribution  from  the  Gaussian  A R ( 1 )  process  where 
X^  -  N(0,2)  and  -  N  ( 0  ,  2  (  \  -  a2 )  )  ,  where  a  is  the  correlation 
coefficient.  We  have  for  the  Gaussian  A R ( 1 )  models, 

0  k  <  - 1  , 

( 3/5 ( 1 -a2 ) 1 /2  k  =  0,  (IV. B. 8) 

akC31 (0)  k  >  1 . 

Note  that  is  C31(k)  for  k  M  is  proportional  to  Corr (X^ >Xn_k )  =  a  . 
This  is  consistent  with  the  fact  that  a  Gaussian  process  is  completely 
determined  by  the  mean  and  covariance  matrix. 

Figures  IV. B. 4  -  IV. B. 6  show  the  theoretical  cr oss - cor r el  at i on 
function  of  the  Gaussian  A R ( 1 )  model  superimposed  on  the  values  for  the 
different  models  from  Figures  IV.B.l  -  IV. B. 3,  which  have  the  standard 


C310O  -  Corr (X^,  Rn_k )  - 
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RESIDUAL  ANALYSIS  COMPARISONS  USING  CORR(X 


Analysis  Comparisions  Using  Corr (X^, Rn_^)  for  the 
AR(1)  Process  and  the  4  RCA(l)  Processes  with 


RESIDUAL  ANALYSIS  COMPARISONS  USING  CORR(Xn3,Rn_k) 

p  =  .89986 


Laplace  marginal  distribution.  There  is  some  differentiation  between 


the  Laplace  models  with  AR(1)  correlation  structure  and  the  given 

Gaussian  AR(1)  model,  but  not  much.  It  would,  however,  be  very  easy  to 

identify  the  Gaussian  model  from  the  Laplace  models  using  probability 

plots.  This  illustrates  the  point  made  at  the  beginning  of  this 

chapter,  that  a  higher-order  residual  analysis  is  not  intended  to 

replace  the  existing  methods  of  analysis.  It  also  emphasizes  one  of  the 

very  foundations  of  the  thesis,  that  the  marginal  distribution  is  one  of 

the  very  first  aspects  of  a  time  series  that  should  be  examined. 

C.  RESIDUAL  ANALYSIS  USING  Corr (R2 , R2_k ) 

In  this  section,  the  residual  analysis  using  the  theoretical 

autocorrelations,  Corr (R2 , R2_^)  is  developed. 

Let  C_„(k)  represent  Corr(R2,R2  .)  for  all  k.  Since  the  correlation 
22  n  n-k 

function  is  an  even  function  and  {R^}  is  stationary,  C22(k)  =  C22 ( - k ) . 
We  represent  only  k  »  0,1,2,...  .  Using  (IV. A. 2),  we  have  after  some 
simplification  for  k  Z  1, 


c22(k) 


{E(R*R*  ,  )  -  E(R’)E(R*  ,.))  /  (o.jO.,  ) 


n  n-k 


n  n-k 


R  R  ,  ' 
n  n-k 


*  2BnV.£„  *  2  <«»■«=*  > 


n  n-k 


R 2  R  2  ,  ' 
n  n-k 


[E(Bn)E(Xn-1Rn-k}  +  E(R“-J{E(e*)  “  E(R")}  '  > 


n-k  n 


R2  R2  ,  ' 
n  n-k 


lE(Bn>  CovCX*_,,R*_k)J  /  <0r20r!  ) 

n  n-k 


Cr.  i— * .  *a'm  —‘a  a  .  u  f.  a  a  J  j  .  a  ..  i _ ...  i  -  ,  _  .  *  *  W  fl  k  .*■  k  .*1.  mV 


E(B  =  )  CovU’, /  Var-tH*). 


(IV.C.1) 


Now  an  immediate  advantage  to  the  analysis  based  on  (IV.C.1)  as 

opposed  to  that  based  on  Corr(X3,  R  .  )  is  apparent.  For  the  constant 

n  n-k 

coefficient  models,  LAR(1),  will  have  a  spike  at  lag-0  and  be 

zero  for  all  other  lags,  since  B  =0.  This  will  not  be  the  case  for 
the  other  NLAR( 1 )  random  coefficient  processes  or  in  the  BELAR(I) 
process.  It  will  not,  however,  distinguish  the  LAR(1)  process  from  any 
linear  ARC  1 )  process.  This,  however,  can  be  achieved  using  probability 
plots  as  mentioned  previously. 

To  derive  a  computational  formula  from  (IV.C.1.)  in  terms  of  the 

parameters  of  the  process,  first  let  E„-(k)  =  E(X2R2  ,  ).  Then, 

22  n  n-k 

substituting  in  (IV. A. 1)  and  (IV. A. 2),  we  have,  after  some 
simplification  for  k  =  0, 


E  (0)  =  E{(cX  ♦  B  X  +  e  ) 2  ( B  X  +  e  )2} 

22  n-1  n  n-1  n  n  n-1  n 

=  E(e“)  +  2cE(e2)  +  12E(e2)E(B2)  +24c2E(B2) 
n  n  n  n  n 

+  U8cE(B3)  +  24E(Bh) . 
n  n 


(IV.C.2) 


For  k  >  1 ,  we  have  the  recursion 


E  (k)  =  {c2  +  E(B2) }E„_(k-1)  ♦  E( e 2 ) E( R2  ). 
22  n  22  n  n-k 


(IV. C. 3) 


Again  using  the  stationarity  of  { }  and  { R 2 }  ,  we  have  the  following 
expression  for  the  autocorrelation  function 


telFT  {E22(k_1)  “  2E(Rn)}*  k  -  U  (IV. C. 4) 
n 

For  the  non-LAR(l)  cases  of  the  NLAR(I)  process,  we  substitute  from 
Table  IV. A. 2  and  equations  (IV. A. 4)  and  (IV. A. 5)  to  obtain 


E22(k) 


C22(k) 


4{6  -  0*6* (5+1  2B^-1 1  ot1  Bp), 
a1®1E22^k_1^  +  ^(  1-ai  Bp  (1~o* 6* )  , 


Jo1  ( 1-o1 )  6*{E22(k-1 )  -  4(1-a*8*)} 
4'{5-o*B*(4  +  246*  -  42(^8*  +  19a2B*: 


k  -  0; 

k  >  1.  (IV. C. 5) 


k  -  0; 
k  £  1  . 


(IV. C. 6) 


The  corresponding  results  for  the  BELAR(I)  process  are 


E22(k>  - 


c22(k) 


1  2( 2+aYz-3Y2 )  , 

k 

-  0; 

aE22(k-1)  ♦  4(1-a) (1-Y2)  , 

k 

>  1. 

1  , 

k 

=  0; 

(ct-Y2)(E22(k-1)  -  4(1-Y2)} 
4(5-12Y2+26aY2-19Y“)  ’ 

k 

>  1  . 

(IV.C.T) 


(IV. C. 8) 


The  theoretical  autocorrelation  functions  for  each  of  the  models  and 
sets  of  parameters  in  Table  IV. A. 1  are  given  in  Figures  IV. C. 1-3.  There 
appears  to  be  more  discrimination  between  the  TLAR( 1 )  model  and  the 
other  random  coefficient  models  with  Corr (R2 , R2  ,  )  than  with 
Corr(X2,  Rn_k),  even  when  |p|  is  small,  as  seen  in  comparing  Figures 
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THEORETICAL  AUTOCORRELATION  OE  Rn2  AND  R 

NLAR(1):  a,=B,=p5  p=  19216  BELARf  1 V  a=  11  n= 
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THEORETICAL  AUTOCORRELATION  OF  R/  AND  R 

NLAR(1):  a,=p5  B,  =— Of,  p=  -  63662  BELAR(l):  or,  =  5  p=- 


RCA(l)  Processes  with  p(l)=~. 63662 


THEORETICAL  AUTOCORRELATION  OE  RJ  AND  R 


Figure 


IV.C.1  and  IV.B.1.  There  is  still  little  discrimination  between  the 

BELAR(I)  model  and  the  given  non-boundary  NLAR(1)  model.  However,  the 

important  point  is  that  since  the  LAR ( 1 )  model  is  a  linear  AR(1)  model, 

Corr(R2,R2  ,)  -  0  for  all  values  of  p  and  for  all  k  -  ±1,  ±2,.... 

n  n-k  K 


V.  EXTENSIONS  AND  OPEN  PROBLEMS 


A.  INTRODUCTION 

During  the  discussion  in  the  previous  chapters,  possible  extensions 
and/or  unresolved  issues  have  been  mentioned.  At  this  point,  we  con¬ 
clude  by  summarizing  some  of  the  directions  in  which  this  research  could 
be  continued.  There  are  still  significant  contributions  to  be  made, 
particularly  in  parameter  estimation,  model  development  and 
applications . 

B.  ESTIMATION 

There  are  several  open  questions  and  extensions  in  the  area  of 
parameter  estimation  and  inference  for  this  class  of  stochastic 
processes . 

First,  there  is  the  need  to  obtain  theoretical  results  substantiat¬ 
ing  the  empirical  results  from  the  simulation  of  the  maximum  likelihood 
estimator  (m.l.e.)  of  serial  orrelation  in  the  TLAR(1)  and  the  BELAR(I) 
processes.  Several  researchers  have  written  on  the  subject  of  maximum 
likelihood  estimation  in  dependent  sequences.  Much  of  this  is  assembled 
in  the  books  by  Basawa  and  Prakasa  Rao  [Ref.  42]  and  by  Basawa  and  Scott 
[Ref.  43].  It  is  not  known  whether  the  conditions  on  the  conditional 
densities  are  satisfied  in  the  cases  of  these  random  coefficient  AR(  1 ) 
models  to  prove  that  the  m.l.e.  is  consistent,  asymptotically  efficient 
or  asymptotically  Normal.  Conditions  for  the  existence  of  the  m.l.e's 


are  generally  extremely  complicated  and  difficult  to  verify  unless  the 
log  likelihood  is  absolutely  continuous  in  the  parameter  space. 

A  second  problem  to  resolve  is  that  of  existence  and  uniqueness  of 
the  maximum  likelihood  estimators  of  (0^,^)  in  the  NLAR(  1 )  process.  In 
this  case,  the  log  likelihood  is  definitely  not  differentiable  with 
respect  to  the  parameter,  6^,  nor  is  it  clear  that  there  is  a  unique 
maximum.  It  appears  from  contour  plots  of  the  log-likelihood  function 
over  a  grid  of  values  in  ( a ^ , 6 ^ )  coordinates  that  there  is  a  unique 
local  optimum  within  the  region  bounded  by  0  <  <1  and  —  1  <  6 1  <  1 
for  large  enough  samples  of  {X^}.  A  non-linear  optimization  technique 
that  uses  only  function  values  and  not  derivatives  seems  to  be 
appropriate,  since  the  log-likelihood  function  is  not  differentiable 
everywhere  with  respect  to  f^. 

A  third  problem  involves  the  4-Beta-Laplace  AR(1)  model.  Except  for 
the  case  when  4  is  assumed  to  be  one  (the  BELAR(I)  model),  the 
likelihood  function  in  (a, 4)  has  not  been  derived.  This  is  primarily  a 
numerical  issue  since  neither  the  density  of  for  non-integer  values 
of  4  nor  the  conditional  density  of  X  given  X  ,  for  any  values  of 
4  >  0  have  closed-form  expressions. 

A  fourth  issue  in  estimation  is  to  extend  the  maximum  likelihood 
approach  to  include  the  joint  estimation  of  the  scale  parameter  of  the 
marginal  distribution  to  that  of  the  shape  parameter  and  the  serial 
correlation  coefficient.  There  is  no  reason  to  assume  that  the  marginal 
distribution  should  always  be  a  standard  Laplace  or  standard  4-Laplace. 


Finally,  there  is  the  issue  of  quantile  estimation  in  the  random 
coefficient  models.  Empirical  results  are  given  only  for  the  BELAR(I) 
process  for  the  distribution  of  the  sample  median.  Theoretical  results 
are  related  to  mixing  conditions.  Based  on  a  new  mixing  condition, 
which  has  been  shown  to  be  satisfied  by  linear  AR( 1 )  processes 
[Ref.  44],  Gastwirth  and  Rubin  derived  the  asymptotic  Normal  theory  of 
quantile  estimation  for  the  linear  LAR( 1 )  process.  The  open  question  is 
whether  the  mixing  condition  of  Gastwirth  and  Rubin  is  satisfied  by  any 
of  the  random  coefficient  models — NLAR(  1 )  ,  BELAR(  1  )  or  i-Beta-Lapl ace 
AR  ( 1 )  . 

C.  MODEL  DEVELOPMENT 

Advances  in  modelling  can  be  made  in  developing  scalar  models  with 
p-th  order  autocorrelation  structure,  as  well  as  bivariate  autoregres¬ 
sive  models. 

An  open  question  in  the  development  of  the  NLARMA ( p , q )  family  of 
models  is  the  existence  of  the  general  class  of  models  with  p-th  order 
autocorrelation  structure — NLAR(p)  for  p  >  3;  specifically,  it  is  to 
derive  the  distribution  of  the  i.i.d.  innovation  sequence  {e^}.  This  is 
only  known  for  the  TLAR(p)  subclass  of  a  proposed  NLAR(p)  family. 

A  similar  question  is  open  for  p  £  2  in  the  continuous  random 
coefficient  models  with  an  2,-Laplace  marginal  distribution.  The  actual 
structure  of  the  model,  as  well  as  the  distribution  of  the  innovation  is 
in  question. 

There  is  also  a  need  for  multivariate  time  series  in  many  fields  of 
physical  science.  The  NEAR(2)  framework  was  used  by  Dewald  and  Lewis 


[Ref.  24]  to  derive  a  bivariate  exponential  AR(  1  )  model.  Such  an 
extension  is  also  possible  with  the  NLAR(2)  model.  Just  how  one 
estimates  the  eight  possible  parameters  in  such  a  model  is  an  open 
question . 

Related  to  the  model  development  and  parameter  estimation  is  the 

need  to  identify  particular  models.  Higher  order  residual  analyses  have 

been  based  on  the  linear  residual  R  =  X  -  a,X  .  -  a„X  _.  Since  the 

NLAR( 2)  model  is  only  partially  time  reversible,  it  is  possible  that  the 

reversed  residual  R  =  X  a,X  ,  -  a_X  ,  could  be  used  in  model 

n  n  1  n+1  2  n  +  1 

identification  as  well.  These  were  introduced  by  Lawrance  and  Lewis 

[Ref.  6,  45]  but  their  use  has  not  been  explored  in  any  context. 

There  is  also  the  question  of  the  effect  that  estimating  a1  and  a 2 

from  { Xn }  will  have  on  the  sample  autocorrelation  of  (R*,R2n_k)  and  the 

cross-correlation  of  (X 3 ,  R  .  )  in  the  fourth-order  residual  analyses 

n  n-k  3 

proposed  in  Chapter  IV. 

D.  APPLICATIONS 

In  Chapter  I,  several  areas  have  been  noted  where  the  modelling  is 
accomplished  with  heavy-tailed  distributions,  notably  in  voice  and 
acoustics  modelling,  as  well  as  in  image  coding.  In  these  areas,  the 
Laplace  distribution  and  the  symmetric  Gamma  distributions  are  widely 
used.  There  is  the  possibility  that  the  Jt-Laplace  for  t  <  1  could  also 
be  a  useful  alternative  to  the  symmetric  Gamma.  One  advantage  of  the 
2,-Laplace  distribution,  which  is  the  difference  of  two  i.i.d.  Gamma  ( l, \) 
is  the  simplicity  of  the  form  of  the  characteristic  function. 


Another  field  in  which  the  £ -Laplace  models  could  be  useful  is  in 
the  modelling  of  the  directional  components  of  wind  speed.  Models  with 
skewed  marginal  distributions  have  been  fitted  to  data  and  then 
transformed  either  to  Normals  (for  example  by  Brown,  Katz  and  Murphy 
[Ref.  46]),  or  to  Exponentials  by  Lawrance  and  Lewis  [Ref.  6],  In  both 
of  the  cited  papers,  the  data  indicated  that  the  wind  was  almost  always 
blowing.  The  question  is,  however,  how  does  one  model  wind  velocity 
when  there  are  long  calm  periods.  This  is  a  problem  from  Australia  as 
related  by  T.  Lewis  in  the  discussion  of  the  NEAR(2)  model  [Ref.  6].  As 
can  be  seen  in  Figure  III.C.1,  for  small  values  of  l,  highly  correlated 
periods  of  calm  and  wind  can  be  generated  using  the  8,-Beta-Laplace  ARC  1 ) 
model . 

The  preceding  examples  demonstrate  the  opportunities  for  continued 
research  and  are  not  intended  to  narrow  the  focus  of  future  endeavors. 


VI.  SUMMARY  AND  CONCLUSIONS 


We  have  indicated  by  reference  to  the  scientific  literature  that  there 
are  important  application  areas,  especially  in  the  physical  sciences  of  time 
series  whose  marginal  distributions  are  non-Normal.  This  feature,  itself, 
presents  new  problems  in  the  modelling,  study  and  analysis. 

For  those  areas  where  the  non-Normal ity  manifests  itself  primarily  in 
the  thickness  (heaviness)  of  the  tails  of  the  marginal  distributions,  we 
have  demonstrated  that  within  the  2,-Laplace  family  of  distributions,  there 
is  an  appropriate  member  with  which  to  model  phenomenon  with  a  symmetric 
heavy-tailed  marginal  distribution.  The  Z-Laplace  family  has  very  thick 
tails  when  l  is  small  and  a  limiting  Normal  distribution  as  l  increases. 

To  account  for  serial  dependence  in  the  time  series  we  have  derived  two 
families  of  random  processes  that  extends  the  random  coefficient  approach  to 
modelling  non-Normal  time  series.  The  discrete  random  coefficient  models 
(NLARMA(p ,q ) )  have  a  Laplace  marginal  distribution  and  the  continuous  random 
coefficient  models  ( Jl-Beta-Laplace  A  R  (  1  )  and  MA(q))  have  an  8,-Laplace 
marginal  distribution.  Both  families  are  additive  models  and  imitate  the 
linear  Gaussian  models  in  that  they  exhibit  the  usual  ARMA(p,q)  correlation 
structure.  The  models  are  parametrically  parsimonious ,  structurally  simple 
and  easy  to  generate  on  a  computer. 

We  have  also  demonstrated  that  the  fourth-order  residual  analyses  based 
on  the  un correlated ,  but  dependent  sequence  {R^}  are  appropriate  and  useful 
methods  to  discriminate  between  the  discrete  random  coefficient  and  the 
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continuous  random  coefficient  models  when  first,  second  and  third-order 


properties  are  identical. 

For  the  purposes  of  parameter  estimation,  we  derived  the  joint 
probability  density  function.  Numerical  routines  were  written  to  maximize 
the  likelihood  function  to  estimate  the  serial  correlation  coefficient  in 
the  BELAR(I)  and  the  TLAR(1)  processes.  Simulation  results  indicated  that 
this  estimator  was  more  efficient  and  less  biased  than  the  least  squares 
estimator  derived  from  the  linear  residual. 

Finally,  we  summarized  some  of  the  remaining  issues  in  this  field  of 
non-Normal  time  series  analysis.  Extensions  of  the  analyses  in  this  thesis 
which  need  to  be  pursued  are  noted,  along  with  possible  applications  in 
those  previously  mentioned  fields  of  the  physical  sciences. 
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